Chapter 1


Introduction 


1.1  Motivation 

Circuit designs with highly regular and repetitive layouts are an effective
solution to the VLSI design bottleneck, and therefore occur quite often
in large VLSI systems. Familiar examples of regular circuit structures are
RAMs, ROMs, PLAs, and array multipliers. In addition, recognition of the
importance of regularity in VLSI systems has given rise to a large and continually
growing collection of new regular structures for applications in signal
processing, image processing, data structures, and CAD, to name a few. Since
these designs are computationally powerful and widely applicable, there is a
great demand for circuit design tools that make these structures generally accessible.
This thesis describes a CAD tool, the Regular Structure Generator
(RSG), that helps meet this demand by performing automatic generation of
regular structure layouts and providing the means to efficiently capture, in
all their richness and variety, most practical regular circuit designs.

Despite  the  uniform  and  repetitive  appearance  of  their  layouts,  effective 
regular structure circuits are not simply bland arrays of identical, abutting
cells.  In  practice,  there  is  always  some  degree  of  complexity  along  the  edges  of 
a  regular  array,  and  each  design  instance  must  be  parametrically  personalized 
with  respect  to  problem  size  and  functionality.  This  requires  the  placement  of 
a  variety  of  cell  maskings  that  implement  such  options  as  transistor  and  bus 
sizing,  cell  interfacing,  clock  assignment,  and  functional  encoding  —  a  task 
which  cannot  be  accomplished  by  the  simple  array  generating  commands 
found  in  graphics  editors.  Although  regularity  does  permit  most  regular 
structures  to  be  personalized  in  an  algorithmic  manner,  a  high  degree  of 
flexibility  is  still  required  in  the  placement  and  orientation  of  the  cells  and 
cell  maskings.  Insofar  as  first  generation  VLSI  layout  tools  lack  this  high 
degree  of  flexibility,  there  is  an  opportunity  for  developing  more  advanced 
module  generators  that  fulfill  this  need. 

The  RSG  was  developed  with  this  approach  to  regular  circuit  layout  in 
mind.  The  input  language  used  for  the  procedural  specification  of  circuit 
architecture  is  a  subset  of  Lisp.  Consequently,  abstraction  mechanisms  are 
available  to  support  a  highly  functional  set  of  primitives  for  defining  regular 
structures and evaluating the complex conditionals required by personalization
and edge effects. Personalization is further supported by the ability to
arbitrarily place and orient cells according to interfaces defined-by-example
in  the  graphical  domain.  All  design  information  is  efficiently  partitioned  into 
procedural  and  graphical  form. 

A circuit layout is generated from the following inputs (Figure 1.1): a
design file, which is a parameterized, procedural description of the architecture;
a layout file, which is a graphical specification of cell layouts and
interfaces; and a parameter file, which provides the size and functional specifications
for the particular case. By completely decoupling the graphical and
procedural domains, a level of modularity is obtained which achieves local
efficiency in layout generation, and global efficiency in the management of
new architectures, layouts, and interfaces to other CAD tools.

Figure 1.1: RSG Layout Generation

The RSG also supports macro abstraction, i.e. the specification of macrocells
as interconnections of smaller cells whose binding to actual layouts can
be delayed to any desired time. In addition, interface inheritance relations
provide a procedural means for defining interfaces between any two macrocells:
a new interface between two macrocells can be computed from any legal
interface between a subcell in the first macrocell and a subcell in the second.
As  a  result,  macrocells  can  be  used  to  specify  even  more  complex  cells  in  an 
entirely  procedural  manner  with  no  need  for  additional  layout. 

At  this  stage  of  the  discussion,  all  of  the  RSG’s  functionality  appears  to 
exist  in  other  layout  generators.  For  instance,  procedural  specification  of 
circuit  layouts  is  as  old  as  silicon  compilation  itself,  and  essentially  defines  it. 
The  novelty  of  the  RSG  is  not  its  use  of  procedural  specification,  but  rather 
the level of abstraction at which it is used. Failure to choose an optimal
level  of  abstraction  complicates  the  user  interface,  and  forces  the  designer 
to  concentrate  as  much  on  the  internal  constraints  of  the  generator  as  on 
the  functionality  of  the  circuit  being  designed.  Examples  of  this  are  layout 
generators  that  require  placement  of  cells  by  strict  abutment,  or  that  do  not 
support  true  hierarchical  macro  abstraction. 

The  significant  contribution  of  the  RSG  is  efficiency,  not  computability, 
of  design.  That  is,  the  RSG  does  not  produce  any  circuit  layouts  which, 
given  unlimited  effort,  could  not  be  produced  by  other  layout  generators. 
The  result  of  this  efficiency,  however,  is  a  tool  that  performs  well  in  practice, 
not  just  in  principle,  in  a  realistic  VLSI  design  setting. 


1.2  Comparison  with  other  layout  generators 

1.2.1  Module  generators  and  Silicon  Compilers 

Specialized VLSI module generators produce layouts of a particular architecture
to implement a specific logic function such as PLAs, ROMs, or
Weinberger arrays. These module generators produce layouts of a specific
style of implementation in a specific technology. For example, a PLA generator
might generate PLAs with a standard NOR/NOR architecture, implemented
with CMOS precharged gates. Such specialized module generators
are capable of generating highly optimized layouts within the restricted class
they are designed for. This is because these generators can incorporate specific
knowledge about the details of their particular implementation. For
instance, a PLA generator which incorporates knowledge about the particular
process technology and type of circuitry used can be made to size power
busses and transistors according to some speed and power criteria. The disadvantage
of these specialized module generators is that their scope is limited
to the applicability of the specific function they implement and to the specific
process technology they use. Other module generators such as HPLA also
generate a single architecture but allow freedom in the implementation and
choice of technology. All of these module generators take as their input a
configuration specification (in the case of a PLA this would be the number
of inputs, outputs, product terms and the truth table) and not a high level
functional specification or an architecture specification, because the functionality
of the output layout is implicit in the single architecture they implement.

Silicon compilers start with a functional specification as their input. However,
current silicon compilers are not capable of determining and implementing
the optimal architecture for a given functional specification and technology.
Instead, these programs implement every functional specification with a
single canonical architecture into which most functional specifications can be
compiled. Their success depends on how well the canonical architecture they
use is suited to the functional specification at hand. Macpitts [29] uses a data
path implemented with registers, adders, and shifters, and a control path
implemented with a Weinberger array as the canonical architecture. While
such an architecture may be suited for some applications, it clearly is not
suited for applications in signal processing which require an efficient implementation
for multiplications. Hence even if the program succeeds in keeping
the transistor density high by packing a lot of circuitry in a small area, the
functional density, measured by how much silicon it takes to implement a
given functionality, is low. This is due to the inappropriate implementation
architecture, where many more transistors are required than would be the
case with a suitable architecture. Early versions of Macpitts required about
5 times the area of layouts generated by hand.

Figure 1.2: Comparison with other layout generators.

Unlike specialized module generators and today's silicon compilers, the
RSG can generate many different architectures with just one framework. By
matching the architecture to the functionality, a level of generality greater
than that of specialized generators can be achieved without the loss of efficiency
incurred in current silicon compilers by a mismatched target architecture.
Another big difference between silicon compilers and the RSG is that
silicon compilers start with a functional description of the problem whereas the
RSG starts with user-defined primitive cells and cell connectivity information
(as shown in Figure 1.1). Figure 1.2 shows how the RSG is moving toward
greater generality than specialized compilers without the loss of efficiency
incurred in today's silicon compilers.

1.2.2  RSG  as  a  superset  of  HPLA 

The RSG expands the scope of HPLA by allowing many different architectures
to be generated with the same benefits as in the case of HPLA, but
with just one framework. Though many of the features of the RSG can be
explained and justified independently of HPLA, HPLA ideas have inspired
and motivated the design of the RSG. HPLA does not support many of the
key features of the RSG such as macro cell abstraction and interface
inheritance. Also, the algorithms and software techniques used in the
RSG are totally different from those used in HPLA. HPLA uses a cell relocation
scheme whereas the RSG uses interfaces and an interface table. However,
both the RSG and HPLA use the idea that adjacent (primitive) cells in the
final layout interface in the same way as they do in the sample layout. Hence
in both programs the (primitive) cell definitions and spacing parameters are
extracted from a sample layout.

In  HPLA  the  sample  layout  was  an  actual  assembled  PLA  and  hence  had 
the  same  architecture  as  the  final  layout.  This  constraint  that  the  sample 
layout  be  a  fully  assembled  PLA  is  actually  superfluous.  Using  the  same 
methods  as  those  used  in  HPLA  (i.e.  relocation)  it  is  possible  to  achieve  the 
same  results  from  a  sample  layout  consisting  of  the  PLA  cells  with  the  only 
other  constraint  being  that  all  possible  interfaces  that  might  occur  in  the 
final  layout  be  present  in  the  sample  layout.  The  fact  that  the  sample  layout 
was  a  two  input,  two  output,  two  product  term  PLA  was  simply  a  way  to 
ensure  that  all  the  required  cells  and  interfaces  between  them  be  present  in 
the  sample  layout  because  the  architectural  specification  for  PLAs  is  already 
hard  coded  in  the  HPLA  program  itself  and  is  not  extracted  from  the  sample 
layout. 

In  the  RSG  this  constraint  is  relaxed.  This  not  only  reduces  the  size  and 
complexity  of  the  sample  layout,  but  it  also  allows  the  same  sample  layout 
to be used in output layouts of various different architectures because the
implicit  architecture  always  present  in  the  sample  layout  does  not  constrain 
the  architecture  of  the  final  layout.  The  sample  layout  in  HPLA  was  actually 
larger  than  necessary  and  contained  redundant  information.  For  example 
the  sample  layout  for  HPLA  contained  2  (identical)  instances  of  the  and-sq 
connect-ao  interface  when  only  one  was  required.  In  so  doing  it  increased 
the  number  of  instances  of  and-sq  and  connect-ao  making  the  sample  layout 
larger  than  necessary.  The  cells  in  many  PLA  sample  layouts  can  also  be  used 
to  generate  other  layouts  besides  PLAs  such  as  decoders  and  multiplexors 
(decoders  can  be  built  from  an  AND  plane  with  appropriate  output  buffers). 
Hence  requiring  that  the  sample  layout  look  like  the  finished  product  is  not 
only  an  unnecessary  restriction  it  also  reduces  the  scope  within  which  any 
given  sample  layout  may  be  used. 

The method (relocation) HPLA uses to generate new cells does not easily
lend itself to cell hierarchy. This did not matter in HPLA because the
architecture that HPLA generates (i.e. the architecture for standard PLAs)
does not make use of cell hierarchy. Making use of cell hierarchy entails generating
a macro cell from the primitive cells in the sample layout replicated
according to some parameter, and then calling the new macro cell in an even
higher order cell several times according to some other parameter. In the
relocation scheme the cell definitions for subcells of a higher order cell are
actually modified to suit the needs of the calling cell. This worked fine in
HPLA because there was only one calling cell, i.e. the complete layout of the
PLA. In a scheme which uses hierarchy there may be many higher order cells
(which can possibly be called in even higher order cells) that call the same
subcell. Each of these cells may request that the called subcell be modified in
some particular fashion to suit its specific needs. These modification requests
can be conflicting. One way to solve the problem would be to create a copy of
the subcell for each of the calling cells, so that each calling cell can modify its
copy of the subcell without conflicting with the modifications requested by
the other calling cells. The RSG, however, uses a simpler and more powerful
technique where this problem does not occur.

1.2.3 The description file versus the interface table

Before HPLA can make a PLA from a sample layout it must first compile
the sample into a special file called the description file. This description file
contains the definition of all the key cells, where the cell definitions have been
modified as prescribed by the relocation scheme. It also contains the spacing
parameters (pitches) for the various cells. In HPLA, for the user's convenience,
the process of making a PLA is divided into three parts, each of which occurs
at a different time in the design cycle. This division of the generation process
allows delayed binding of the specifics of the PLA encoding until after the
PLA is fully installed into the rest of a layout. The description file is accessed
at each of these three phases, hence it makes sense to create the description
file just once and refer to it in each of the three phases of the PLA design.

In  the  case  of  the  RSG  the  data  structure  corresponding  to  the  description 
file  would  be  the  interface  table.  However  since  the  RSG  produces  the  whole 
layout  all  at  once,  it  does  not  make  sense  to  store  the  data  structure  into  a 
file  and  load  it  back  immediately  into  the  workspace  and  use  it  during  just 
one  session.  Therefore  no  temporary  file  is  created. 

The  RSG  can  generate  any  PLA  that  HPLA  can.  It  can  also  generate 
more complex PLAs such as PLAs with folded rows or columns. However,
in  HPLA  the  division  of  the  generation  process  into  three  parts  facilitates 
recoding  the  PLA  (or  postponing  its  encoding)  and  speeds  up  the  plotting  of 
the  chip  by  leaving  out  the  PLA’s  crosspoints  until  required,  making  HPLA 
a  little  more  convenient  to  use. 

1.3  Thesis  organization 


• Chapter 2 lays down the mathematical foundations of interfaces, the
method the RSG uses for local placement constraints.

• Chapter 3 gives the overall RSG algorithm.

• Chapter 4 describes the language for specifying design files and describes
in more detail the specifics of the underlying data structures.

• Chapter 5 describes the design of a class of pipelined multipliers using
the RSG.

• Chapter 6 is concerned with issues relating to building a special type
of compactor for use with the RSG.

Each chapter is organized so that the first sections lay down the concept and
the foundations of the method and the last sections go into the details of
some important facet of the problem.
Chapter 2


Interfaces 


2.1  Cells  and  Instances 

The RSG requires user-defined cells to hierarchically build larger cells. A
cell A consists of objects whose locations in the cell are defined in terms of
a local coordinate system $C_a$ with origin $S_a$. The objects in A can be boxes
of various layers, points, and instances of other cells. An instance of a cell
B is the triplet $(L^r_b, O^r_b, \langle\text{cell definition}\rangle)$ where $L^r_b$ is the point of call of
the cell B, $O^r_b$ is the orientation in the call of B and $\langle\text{cell definition}\rangle$ is a
pointer to the cell definition of B (the superscript r means that the location
or orientation is relative to a calling coordinate system). The effect of having
an instance of B in A with point of call $L^r_b$ and orientation $O^r_b$ is that of
performing the isometry¹ $O^r_b$ on B ($O^r_b$ is an isometry that leaves $S_b$, the
origin of the coordinate system within B, unchanged), placing the origin $S_b$
of B at location $L^r_b$ within the coordinate system of A, and finally adding to
A the collection of objects in B (see Figure 2.1).

Figure 2.1: Instance of cell B in cell A.
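The effect of calling an instance can be sketched in a few lines of Python. This is only an illustration under stated assumptions: orientations are stored as 2x2 integer matrices, boxes as (layer, x1, y1, x2, y2) tuples, and the names `ROT90`, `apply`, `expand_instance` and the layer names are hypothetical, not taken from the RSG itself.

```python
NORTH = ((1, 0), (0, 1))    # identity transform
ROT90 = ((0, -1), (1, 0))   # hypothetical name: 90 degree counterclockwise rotation


def apply(o, v):
    """Apply the orientation matrix o to the vector v."""
    return (o[0][0] * v[0] + o[0][1] * v[1], o[1][0] * v[0] + o[1][1] * v[1])


def expand_instance(point_of_call, orientation, boxes):
    """Return the boxes of a called cell expressed in the calling cell's coordinates.

    Each box is first transformed by the orientation (which leaves the called
    cell's origin fixed) and then translated so that the origin lands on the
    point of call.
    """
    placed = []
    for layer, x1, y1, x2, y2 in boxes:
        ax, ay = apply(orientation, (x1, y1))
        bx, by = apply(orientation, (x2, y2))
        ax, ay = ax + point_of_call[0], ay + point_of_call[1]
        bx, by = bx + point_of_call[0], by + point_of_call[1]
        # Re-normalize so the box is again (lower-left, upper-right).
        placed.append((layer, min(ax, bx), min(ay, by), max(ax, bx), max(ay, by)))
    return placed


# Cell B holds one box; cell A holds its own box plus an instance of B.
cell_b = [("metal", 0, 0, 2, 1)]
cell_a = [("poly", 0, 0, 1, 1)]
cell_a += expand_instance((5, 5), ROT90, cell_b)
```

Here the instance of B is called at (5, 5) with a quarter-turn orientation, so its box is rotated about B's origin before being added to A.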

2.2  Interface  Definition 

A key notion in the RSG is the interface. If instances of cells A and B
(the cells A and B do not necessarily have to be distinct) are to be called
within the same coordinate system, then cells A and B have an interface
between them. The interface between two cells A and B is the ordered pair
$I_{ab} = (V_{ab}, O_{ab})$ ($I_{ab} \neq I_{ba}$) where $V_{ab}$ is the interface vector and $O_{ab}$ is the
interface orientation. $V_{ab}$ is the vector whose starting point is the point of
call of A and whose endpoint is the point of call of B, if the instance of A is
held at orientation north (identity transform). $O_{ab}$ is the orientation that B
would have if the instance of A were held at orientation north.


Treating the orientations as operators with "∘" being the operator composition
rule we have²:

$$O_{ab} = (O^r_a)^{-1} \circ O^r_b \tag{2.1}$$

$$V_{ab} = (O^r_a)^{-1}(L^r_b - L^r_a) \tag{2.2}$$

The interface vector $V_{ab}$ and interface orientation $O_{ab}$ are obtained by
deskewing the relative orientation of B, i.e. $O^r_b$, and the vector $(L^r_b - L^r_a)$ by
the inverse orientation of A, $(O^r_a)^{-1}$.
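As a concrete illustration of equations 2.1 and 2.2, the following Python sketch computes an interface from the calling information of two instances. It assumes that orientations are represented as 2x2 integer matrices, which is a convenient choice for the example and not necessarily the representation used by the RSG; the function names are likewise hypothetical.

```python
NORTH = ((1, 0), (0, 1))      # identity transform
SOUTH = ((-1, 0), (0, -1))    # rotation by 180 degrees


def compose(p, q):
    """Return the operator p o q (q is applied first, then p)."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inverse(o):
    """An orientation matrix is orthogonal, so its inverse is its transpose."""
    return ((o[0][0], o[1][0]), (o[0][1], o[1][1]))


def apply(o, v):
    """Apply the orientation matrix o to the vector v."""
    return (o[0][0] * v[0] + o[0][1] * v[1], o[1][0] * v[0] + o[1][1] * v[1])


def interface(loc_a, ori_a, loc_b, ori_b):
    """Return (V_ab, O_ab) for instances of A and B in one coordinate system."""
    inv_a = inverse(ori_a)
    v_ab = apply(inv_a, (loc_b[0] - loc_a[0], loc_b[1] - loc_a[1]))  # eq. 2.2
    o_ab = compose(inv_a, ori_b)                                      # eq. 2.1
    return v_ab, o_ab


# A is called at (10, 5) oriented South, B at (4, 5) oriented North.
v_ab, o_ab = interface((10, 5), SOUTH, (4, 5), NORTH)
```

With A deskewed to North, B ends up six units to the right of A and oriented South, which is what the returned pair expresses.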

Figure 2.2(a) shows an instance of A and an instance of B called together
in the same higher order cell (characterized in Figure 2.2(a) by its coordinate
system $(O, i, j)$). The point of call $L^r_a$ (respectively $L^r_b$) of A (respectively
B) is the location where the origin of A (respectively B) is placed in the calling
coordinate system $(O, i, j)$. In order to obtain the interface $I_{ab}$ between A
and B we must first perform an isometry on the calling cell (the one with the
$(O, i, j)$ coordinate system in Figure 2.2(b)) such that the new orientation
for the instance of A will be North. Since A is initially oriented South, the
calling cell must be reoriented by $\text{South}^{-1} = \text{South}$ (because a rotation by 180° is the same as a rotation by −180°)
so that A will ultimately be oriented North. Figure 2.2(b) shows the result
of the transformation of the calling cell. The interface vector is now the vector
whose starting point is at the new point of call of A and whose endpoint
is at the new point of call of B. The coordinates of the interface vector are
computed in terms of the new basis $(i', j')$ which is the same as the old basis
$(i, j)$ of the calling cell before the transformation was performed. The interface
orientation is now the new orientation of B after the transformation
was performed.

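Figure 2.2: Interface between two cells.

The existence of an $I_{ab}$ interface between A and B automatically gives
rise to an interface between B and A. The expression for the $I_{ba}$ interface
can be obtained from equations 2.1 and 2.2.

$$\begin{aligned}
O_{ba} &= (O^r_b)^{-1} \circ O^r_a \\
       &= ((O^r_a)^{-1} \circ O^r_b)^{-1} \\
       &= O_{ab}^{-1}
\end{aligned} \tag{2.3}$$

$$\begin{aligned}
V_{ba} &= (O^r_b)^{-1}(L^r_a - L^r_b) \\
       &= ((O^r_b)^{-1} \circ (O^r_a \circ (O^r_a)^{-1}))(L^r_a - L^r_b) \\
       &= (((O^r_b)^{-1} \circ O^r_a) \circ (O^r_a)^{-1})(L^r_a - L^r_b) \\
       &= (O_{ba} \circ (O^r_a)^{-1})(L^r_a - L^r_b) \\
       &= (O_{ab}^{-1} \circ (O^r_a)^{-1})(L^r_a - L^r_b) \\
       &= -O_{ab}^{-1}((O^r_a)^{-1}(L^r_b - L^r_a)) \\
       &= -O_{ab}^{-1} V_{ab}
\end{aligned} \tag{2.4}$$

Therefore $I_{ba} = (V_{ba}, O_{ba}) = (-O_{ab}^{-1} V_{ab}, O_{ab}^{-1})$.

² $O^{-1}$ is defined by $O^{-1} \circ O = O \circ O^{-1} = \text{Identity}$.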
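The reverse interface can be computed mechanically from the forward one. The sketch below does so in Python, again assuming a 2x2 integer matrix representation for orientations (an assumption of this example, with hypothetical helper names), and checks that reversing twice returns the original interface.

```python
NORTH = ((1, 0), (0, 1))     # identity transform
ROT90 = ((0, -1), (1, 0))    # hypothetical name: 90 degree counterclockwise rotation


def inverse(o):
    """An orientation matrix is orthogonal, so its inverse is its transpose."""
    return ((o[0][0], o[1][0]), (o[0][1], o[1][1]))


def apply(o, v):
    """Apply the orientation matrix o to the vector v."""
    return (o[0][0] * v[0] + o[0][1] * v[1], o[1][0] * v[0] + o[1][1] * v[1])


def reverse_interface(v_ab, o_ab):
    """Return I_ba = (-O_ab^-1 V_ab, O_ab^-1) from I_ab = (V_ab, O_ab)."""
    o_ba = inverse(o_ab)
    x, y = apply(o_ba, v_ab)
    return (-x, -y), o_ba


# B sits six units to the right of a North-oriented A, turned by 90 degrees.
i_ab = ((6, 0), ROT90)
i_ba = reverse_interface(*i_ab)
round_trip = reverse_interface(*i_ba)
```

Seen from B held at North, A lies six units above B and is turned by a quarter turn in the opposite direction.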


2.3  Advantages  of  using  interfaces 

Interfaces are a natural way of defining the relative placement and orientation
between instances of cells. Hence, knowing the calling information of
a cell A in a cell C and knowing the interface between A and B, it is possible
to determine the calling information of B in C. The RSG allows the
user to specify the primitive cells and interfaces between them graphically,
by providing a layout file which will henceforth be referred to as the sample
layout. The sample layout contains the definitions of all primitive cells as
well as interfaces between them. An interface between cells A and B can be
defined by calling A and B together in a higher order cell C with the appropriate
relative placement and orientation between them. In practice, when
new cells are created by the layout designer they are assembled together in
order to verify that the different new cells that have been designed do in
fact interface properly to each other. The simple fact of assembling the cells
together requires calling them both in one cell (same coordinate system) and
therefore automatically defines an interface between them. Hence interfaces
can be designed at almost no extra cost to the designer.
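Solving equations 2.1 and 2.2 for the calling information of B gives $L^r_b = L^r_a + O^r_a V_{ab}$ and $O^r_b = O^r_a \circ O_{ab}$. A minimal Python sketch of this placement step follows; the 2x2 matrix representation of orientations and the name `place_neighbor` are assumptions made for the example only.

```python
NORTH = ((1, 0), (0, 1))      # identity transform
SOUTH = ((-1, 0), (0, -1))    # rotation by 180 degrees


def compose(p, q):
    """Return the operator p o q (q is applied first, then p)."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def apply(o, v):
    """Apply the orientation matrix o to the vector v."""
    return (o[0][0] * v[0] + o[0][1] * v[1], o[1][0] * v[0] + o[1][1] * v[1])


def place_neighbor(loc_a, ori_a, v_ab, o_ab):
    """Given A's calling information and I_ab, return B's calling information."""
    dx, dy = apply(ori_a, v_ab)          # re-skew the interface vector by O_a
    loc_b = (loc_a[0] + dx, loc_a[1] + dy)
    ori_b = compose(ori_a, o_ab)         # O_b = O_a o O_ab
    return loc_b, ori_b


# A is called at (10, 5) oriented South; the interface to B is ((6, 0), South).
loc_b, ori_b = place_neighbor((10, 5), SOUTH, (6, 0), SOUTH)
```

Because A is turned by 180 degrees, the interface vector points the other way in the calling cell, and B ends up at (4, 5) oriented North.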

By virtue of the design-by-example feature of the RSG, the relative placement
of neighboring cells in the final layout is such that each interface in the
final layout is an instance of an interface in the sample layout.

Since  the  relative  placement  of  cells  in  the  final  layout  is  performed  using 
interfaces  between  cells  and  not  by  using  the  sizes  and  shapes  of  the  bounding 
boxes  of  those  cells,  the  cells  can  be  designed  according  to  their  functional 
boundary  constraints  and  without  regard  to  abutment  constraints.  Not  only 
does  this  make  cells  easier  to  design  and  design  rule  check  (because  instances 
of cells can overlap, each cell can be made design rule correct³), the fact that
cells are not cut at artificial boundaries helps reduce the proliferation of cells
of essentially the same functionality but different abutment constraints. Using
interfaces also allows cells to be easily encoded by superimposing several
cells  in  order  to  modify  the  functionality  of  a  basic  cell.  This  too  helps  in 
reducing  the  proliferation  of  different  cell  types  since  the  number  of  different 
encoding  configurations  is  roughly  exponential  in  the  number  of  independent 
encoding  decisions. 

Cell encoding can also simplify the personalization process since instead
of combining all the encoding decisions together to select a single cell of the
appropriate type we can use each independent encoding decision to perform
a simple encoding masking of one basic cell. An encoding cell may lie well
within the bounding box of the cell it encodes and hence placement by abutment
would be cumbersome since it would cause a proliferation of (spacing)
cells that have nothing to do with functionality. By simply specifying an
interface the relative orientation of the cells as well as whether the cells are
side by side, one on top of the other, or one inside the other, is handled
automatically.

³ Some hierarchical design rule checkers require that instances do not overlap.

2.4  The  Interface  Table 

The  RSG  program  maintains  an  interface  table  of  all  legal  (user  specified) 
interfaces between cells. This table is first initialized with interfaces from the
sample  layout  and  can  be  augmented  as  new  cells  are  created  by  the  system. 
Since  there  can  be  several  different  legal  interfaces  between  two  cells  there 
can  be  a  family  of  legal  interfaces  between  two  cells  A  and  B.  Figure  2.3 
shows  two  different  possible  interfaces  for  a  pair  of  cells  A,  B. 

Figure 2.3: Different Interfaces between two cells.

If the set of legal interfaces between any two cells is indexed over the
integers then the interface table can be described as a mapping from triplets:

$$(\langle\text{cellname1}\rangle, \langle\text{cellname2}\rangle, \langle\text{interface index number}\rangle) \tag{2.5}$$

to interfaces:

$$(\langle\text{interface vector}\rangle, \langle\text{interface orientation}\rangle) \tag{2.6}$$

If $I_{ab}$ is an interface in the interface table, then $I_{ba}$, the corresponding interface
between B and A, is also loaded in the interface table. Hence knowing the
placement of A one can determine the placement of B and vice versa. This
bilaterality of the interface table is very important. We will see in section 3.4
that it may not be possible to determine in advance which of the two instances
A or B has a known placement and which one will have its placement derived
from the other.
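The mapping of 2.5 to 2.6, together with the bilateral loading described above, can be sketched as a small Python class. The class name, method names, and the matrix representation of orientations are all assumptions of this sketch; the section does not say how the RSG stores the table internally.

```python
NORTH = ((1, 0), (0, 1))     # identity transform
ROT90 = ((0, -1), (1, 0))    # hypothetical name: 90 degree counterclockwise rotation


def inverse(o):
    """An orientation matrix is orthogonal, so its inverse is its transpose."""
    return ((o[0][0], o[1][0]), (o[0][1], o[1][1]))


def apply(o, v):
    """Apply the orientation matrix o to the vector v."""
    return (o[0][0] * v[0] + o[0][1] * v[1], o[1][0] * v[0] + o[1][1] * v[1])


class InterfaceTable:
    """Maps (cellname1, cellname2, index) to (interface vector, orientation)."""

    def __init__(self):
        self._entries = {}

    def add(self, cell_a, cell_b, index, vector, orientation):
        """Load I_ab and, for distinct cells, the reverse interface I_ba."""
        self._entries[(cell_a, cell_b, index)] = (vector, orientation)
        # For cell_a == cell_b the reverse key would collide with the forward
        # one; this sketch simply skips it (how the RSG treats that case is
        # not described in this section).
        if cell_a != cell_b:
            inv = inverse(orientation)
            x, y = apply(inv, vector)
            self._entries[(cell_b, cell_a, index)] = ((-x, -y), inv)

    def lookup(self, cell_a, cell_b, index):
        return self._entries[(cell_a, cell_b, index)]


# Two different legal interfaces between the same pair of cells.
table = InterfaceTable()
table.add("A", "B", 1, (6, 0), ROT90)
table.add("A", "B", 2, (0, 4), NORTH)
```

A lookup in either direction then succeeds, so a placement routine does not need to know in advance which of the two instances is already placed.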


2.5  Interface  Inheritance  Relations 


In order for any cell to be used in the RSG it must have an interface with
some other cell, otherwise there is no way to place it. When new cells are
built up hierarchically by the system, in order to take full advantage of cell
hierarchy, interfaces for new cells can be specified in terms of existing ones.
In this way cells built up by the system can be used to build even larger cells
in exactly the same fashion as were the primitive cells of the sample layout.

If A (respectively B) is a subcell of a new cell C (respectively D), it is
then possible to define a new interface $I_{cd}$ between C and D in terms of
an existing interface $I_{ab}$ between A and B. $I_{cd}$ is the interface that C and
D would inherit if the subcells A and B within C and D were placed and
oriented with interface $I_{ab}$ (see Figure 2.4). The RSG allows the user to
define a new interface (and load it into the interface table) by specifying
the two cells C and D, the instances of A and B in C and D, the interface
number of the interface between A and B and an interface number for the
newly defined interface between C and D.

Figure 2.4: Interface Inheritance

The rest of this section is concerned with finding an algebraic expression
for the interface vector and interface orientation of the new interface $I_{cd}$
between C and D in terms of the existing interface $I_{ab}$ between A and B and
the calling parameters of the instances of A and B in C and D. Let⁴ $(L^c_a, O^c_a)$
(respectively $(L^d_b, O^d_b)$) be the calling information of A (respectively B) in C
(respectively D) and $(V_{ab}, O_{ab})$ (respectively $(V_{cd}, O_{cd})$) be the interface vector
and interface orientation of $I_{ab}$ (respectively $I_{cd}$). Also let $L^r_a$ (respectively
$L^r_b, L^r_c, L^r_d$) be the location of the origin of A (respectively B, C, D) in the
implicit calling coordinate system (i.e. as they appear in Figure 2.4) and
let $O^r_a$ (respectively $O^r_b, O^r_c, O^r_d$) be the orientation of A (respectively B, C, D)
in the implicit calling coordinate system (which can be, for argument's sake,
considered to be the absolute coordinate system). Then, composing the calling
parameters of each subcell with those of its parent cell:

$$O^r_a = O^r_c \circ O^c_a \tag{2.7}$$

$$L^r_a = L^r_c + O^r_c L^c_a \tag{2.8}$$

$$O^r_b = O^r_d \circ O^d_b \tag{2.9}$$

$$L^r_b = L^r_d + O^r_d L^d_b \tag{2.10}$$

⁴ The superscripts c (respectively d) mean that the locations and orientations are relative
to the coordinate system of C (respectively D).

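Replacing equations 2.7 and 2.9 in equation 2.1 we get:

$$\begin{aligned}
O_{ab} &= (O^r_a)^{-1} \circ O^r_b \\
       &= (O^r_c \circ O^c_a)^{-1} \circ O^r_d \circ O^d_b \\
       &= (O^c_a)^{-1} \circ (O^r_c)^{-1} \circ O^r_d \circ O^d_b \\
O^c_a \circ O_{ab} \circ (O^d_b)^{-1} &= (O^r_c)^{-1} \circ O^r_d \\
       &= O_{cd}
\end{aligned}$$

$$O_{cd} = O^c_a \circ O_{ab} \circ (O^d_b)^{-1} \tag{2.11}$$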

Replacing  equations  2.8  and  2.10  in  equation  2.2  we  get: 


V-  =  (o;)-‘(i;  - 1:) 

h\-L\  =  Oiv^-cwf  +  o;  L? 

(o;)-'(£;-i3  =  ((o«T‘ « O'jv'.,  -  ((o;)-‘  o  ojw  +  (o:)-> » (o;i?) 

Using  equations  2.2  and  2.1  with  different  subscripts,  equation  2.7  and 
the  previous  result  we  get: 


Vcd  = 


ra-‘(£;  -  £?) 

((0,')-‘  o  o:)K»  -  ({o;)-‘  o  Oi)i^  +  «o;)-‘  o  oj)£. 


(2.12) 


=  o':v^-(o:yil?+l 
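As a numerical check of equations 2.11 and 2.12, the following minimal sketch represents orientations as 2 × 2 integer matrices and compares the inherited interface with the one obtained directly from absolute placements. It assumes that equations 2.1 and 2.2 have the form $O_{ab} = (O_a)^{-1} \circ O_b$ and $V_{ab} = (O_a)^{-1}(L_b - L_a)$, which is how the derivation above uses them. All function names and sample placements are hypothetical and are not taken from the RSG itself.

```python
# Orientations are 2x2 integer matrices stored as tuples of rows.
# For an isometry matrix the inverse is simply the transpose.

def mul(p, q):
    """Matrix product p o q (q is applied first)."""
    return tuple(
        tuple(sum(p[r][m] * q[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )

def inv(p):
    """Inverse of an isometry matrix (its transpose)."""
    return ((p[0][0], p[1][0]), (p[0][1], p[1][1]))

def act(p, v):
    """Apply matrix p to vector v."""
    return (p[0][0] * v[0] + p[0][1] * v[1], p[1][0] * v[0] + p[1][1] * v[1])

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def place(parent_loc, parent_ori, child_loc, child_ori):
    """Absolute placement of a child called inside a parent (eqs. 2.7-2.10)."""
    return add(parent_loc, act(parent_ori, child_loc)), mul(parent_ori, child_ori)

def interface(loc_a, ori_a, loc_b, ori_b):
    """Interface (V_ab, O_ab) between two placements (assumed eqs. 2.1, 2.2)."""
    return act(inv(ori_a), sub(loc_b, loc_a)), mul(inv(ori_a), ori_b)

def inherit(v_ab, o_ab, loc_a_in_c, ori_a_in_c, loc_b_in_d, ori_b_in_d):
    """Inherited interface (V_cd, O_cd) from equations 2.11 and 2.12."""
    o_cd = mul(mul(ori_a_in_c, o_ab), inv(ori_b_in_d))                          # 2.11
    v_cd = add(sub(act(ori_a_in_c, v_ab), act(o_cd, loc_b_in_d)), loc_a_in_c)   # 2.12
    return v_cd, o_cd

ROT90 = ((0, -1), (1, 0))      # quarter turn: (x, y) -> (-y, x)
MIRROR_Y = ((-1, 0), (0, 1))   # reflection about the y axis

# Hypothetical placements: C and D in the absolute frame, A inside C, B inside D.
c_abs = ((5, 7), ROT90)
d_abs = ((-3, 2), MIRROR_Y)
a_in_c = ((1, 4), MIRROR_Y)
b_in_d = ((6, -2), mul(ROT90, ROT90))

a_abs = place(*c_abs, *a_in_c)
b_abs = place(*d_abs, *b_in_d)
v_ab, o_ab = interface(*a_abs, *b_abs)

direct = interface(*c_abs, *d_abs)                   # computed from C and D themselves
inherited = inherit(v_ab, o_ab, *a_in_c, *b_in_d)    # computed from I_ab only
print(direct == inherited)
```

The inherited interface needs only $I_{ab}$ and the calling parameters of A in C and of B in D; the absolute placements of C and D appear here solely to produce a reference value to compare against.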


2.6  An  efficient  representation  for  orientations 


Whereas  interface  vectors  can  be  straightforwardly  represented  by  a  pair 
of  real  numbers,  orientations  require  a  slightly  more  complex  data  structure. 
The  purpose  of  this  section  is  to  find  an  efficient  representation  for  orien¬ 
tations  in  terms  of  memory,  computation  and  ease  of  manipulation.  Recall 
from  Section  2.1  that  calling  an  instance  of  B  in  A  consists  of  performing  an 
affine  isometry  to  the  objects  in  B  and  then  adding  the  collection  of  objects 
in  B  to  A.  A  layout  editor  needs  to  be  able  to  perform  affine  isometries  on 
the  various  cells.  If  A  is  called  in  a  cell  B  which  is  in  turn  called  in  a  higher 
order  cell  C  then  two  affine  isometries  get  applied  to  the  objects  in  A.  The 
first isometry $I_1$ corresponds to the calling parameters of A in B and the
second isometry $I_2$ corresponds to the calling parameters of B in C. For an
object $Ob$ in A the corresponding component in C would be $I_2(I_1(Ob))$. $I_1$ is
first performed on $Ob$ and then $I_2$ is performed on the resulting object.

Another  way  to  perform  isometry  composition  is  to  first  compose  the 
two  operators  and  then  apply  the  resulting  operator  to  the  object.  Since 
$I_2(I_1(Ob)) = (I_2 \circ I_1)(Ob)$ it is possible to first compute $(I_2 \circ I_1)$ and then
apply  this  new  transformation  to  Ob.  This  method  of  first  computing  the 
resulting  isometry  and  then  applying  it  to  the  object  can  be  computationally 
more  efficient  as  the  resulting  isometry  is  computed  only  once  and  hence 
effort  is  not  duplicated  over  the  various  objects  on  which  this  transformation 
is  to  be  performed. 

In layout editors the preferred way of composing operators could be
$I_2(I_1(Ob))$ because this method is easier to implement.⁵ If there is already a
method for applying an isometry to objects then, since the result of applying
an isometry to an object is an object of the same type, no extra mechanism
is needed to successively apply several isometries to the object. In the
case where only a finite set of legal isometries is implemented this method
can lead to more efficient methods for applying single isometries to objects.
For example one could index the set of available isometries over the integers.
In that case, in order to apply an isometry known by its index number to a
given object, one could use the index number to look up a table of procedures
(one procedure per isometry), get the procedure that implements that
particular isometry, and then apply it to the object.⁶ This method eliminates
the interpretive overhead associated with the decoding of the isometry
representation. For example isometries can be represented as matrices, and
a program that can apply any matrix transform to an object would be slower
than one that performs a single fixed linear operation. However this indexed
representation does not lend itself to symbolic composition. If the
number of implemented isometries is n then (assuming that the set of
implemented isometries is closed under composition) computing the index of
$(I_2 \circ I_1)$ from the index of $I_2$ and the index of $I_1$ requires a
mapping table from n × n integers to n integers. Another table from n to
n integers is also required to invert the isometries (assuming the set is closed
under inversion). Hence this method becomes cumbersome in the case where
there is a large number of implemented isometries. It also requires a large
number of procedures, one for each implemented isometry.

⁵ However HPEDIT uses the $I_2 \circ I_1$ method.

⁶ HPEDIT uses this method.
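The indexed scheme can be sketched as follows for the eight isometries that keep edges parallel to the axes. This is only an illustration: the index order, the names, and the way the tables are built (by probing with a generic point) are assumptions, not a description of HPEDIT.

```python
# One procedure per isometry; the index order is an arbitrary, hypothetical choice.
PROCS = [
    lambda x, y: (x, y),     # 0: identity
    lambda x, y: (-y, x),    # 1: quarter turn
    lambda x, y: (-x, -y),   # 2: half turn
    lambda x, y: (y, -x),    # 3: three-quarter turn
    lambda x, y: (-x, y),    # 4: reflection about the y axis
    lambda x, y: (x, -y),    # 5: reflection about the x axis
    lambda x, y: (y, x),     # 6: reflection about the line y = x
    lambda x, y: (-y, -x),   # 7: reflection about the line y = -x
]
N = len(PROCS)

def apply_indexed(index, point):
    """Apply the isometry with the given index by direct table lookup."""
    return PROCS[index](*point)

# The image of a generic probe point identifies each of the eight isometries.
PROBE = (1, 2)
INDEX_OF_IMAGE = {apply_indexed(i, PROBE): i for i in range(N)}

# n x n table: COMPOSE[i2][i1] is the index of (I_2 o I_1), with I_1 applied first.
COMPOSE = [
    [INDEX_OF_IMAGE[apply_indexed(i2, apply_indexed(i1, PROBE))] for i1 in range(N)]
    for i2 in range(N)
]

# n-entry table: INVERSE[i] is the index of the isometry that undoes isometry i.
INVERSE = [COMPOSE[i].index(0) for i in range(N)]
```

With eight isometries the tables hold 64 and 8 entries. Both the number of procedures and the size of the composition table grow with the number of supported isometries, which is the drawback noted above.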

In the RSG at times it is necessary to obtain expressions for new
transformations and therefore operations for symbolic composition and inversion
of  transformations  are  required.  Recall  equations  2.11  and  2.12  from  Sec¬ 
tion  2.5.  In  order  to  compute  the  new  inherited  interface  vector  and  interface 
orientation,  we  need  to  obtain  expressions  for  the  composition  and  inversion 
of  orientations.  It  is  therefore  necessary  to  have  a  representation  for  orienta¬ 
tions  that  allows  them  to  be  easily  applied  as  operators  and  also  allows  them 
to  be  easily  composed  and  inverted. 

One possible way to implement all orientations is to use 2 × 2 matrices of
real numbers. Such matrices can, however, represent every linear
transformation of the vector plane, of which the isometries (the
orientations) are only a very small subset. As a result they require
storage and manipulation of much more information than is needed. Matrix
composition and inversion are also relatively costly computationally.

There are more compact representations for orientations. We can
represent all the vector rotations in the plane with a real number in the
interval [0, 2π[. The rotation can be expressed by the complex number
$e^{ij}$ where j belongs to [0, 2π[ and $i^2 = -1$. Orientations are either
rotations about the origin or reflections about an axis passing through the
origin. All the reflections about an axis passing through the origin can,
however, be generated by composing the reflection about the y axis (or any
other axis passing through the origin) with a rotation about the origin. If
M is the interval [0, 2π[ and B is the set of Booleans, it is then possible
to represent an orientation by the pair (j, k) ∈ M × B where j represents the
rotation, and k indicates whether or not a reflection about the y axis is to
be performed before the rotation (the composition of rotations and
reflections is not commutative). If + (respectively −) is the induced
addition (respectively subtraction) modulo 2π on M and if R is the
reflection about the y axis, then any orientation can be written as
$e^{ij} \circ R^k$ where (j, k) ∈ M × B and $i^2 = -1$.

2.6.1  Inverting  two  orientations. 

Let $O = e^{ij} \circ R^k$
and $O^{-1} = e^{ij'} \circ R^{k'}$.

• If k = 1, then O is a reflection. Therefore $O \circ O = I$ where I is the
identity transform and hence

$$O^{-1} = O = e^{ij} \circ R^k = e^{ij'} \circ R^{k'} \tag{2.13}$$

so j' = j and k' = k.

• If k = 0, then O is a rotation and hence

$$O^{-1} = e^{ij'} = (e^{ij})^{-1} = e^{i(-j)} \tag{2.14}$$

so j' = −j and k' = k.

Hence if k = 1 then j' = j, k' = k; otherwise j' = −j, k' = k.


2.6.2  Composing  two  orientations 


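Let

$$\begin{aligned}
O_1 &= e^{ij_1} \circ R^{k_1} \\
O_2 &= e^{ij_2} \circ R^{k_2} \\
O &= O_2 \circ O_1 = e^{ij} \circ R^k
\end{aligned} \tag{2.15}$$

Then

$$\begin{aligned}
O &= (e^{ij_2} \circ R^{k_2}) \circ (e^{ij_1} \circ R^{k_1}) \\
  &= e^{ij_2} \circ (R^{k_2} \circ e^{ij_1}) \circ R^{k_1}
\end{aligned} \tag{2.16}$$

because of the associativity of linear operators.

• If $k_2 = 1$ then $R^{k_2} \circ e^{ij_1}$ is a reflection and hence
$(R^{k_2} \circ e^{ij_1}) \circ (R^{k_2} \circ e^{ij_1}) = I$, therefore

$$\begin{aligned}
R^{k_2} \circ e^{ij_1} &= (R^{k_2} \circ e^{ij_1})^{-1} \\
  &= (e^{ij_1})^{-1} \circ (R^{k_2})^{-1} \\
  &= e^{i(-j_1)} \circ R^{k_2}
\end{aligned} \tag{2.17}$$

because $R^{k_2}$ is a reflection (or identity) and $e^{ij_1}$ is a rotation,
therefore

$$\begin{aligned}
O &= e^{ij_2} \circ (R^{k_2} \circ e^{ij_1}) \circ R^{k_1} \\
  &= e^{ij_2} \circ (e^{i(-j_1)} \circ R^{k_2}) \circ R^{k_1} \\
  &= (e^{ij_2} \circ e^{i(-j_1)}) \circ (R^{k_2} \circ R^{k_1}) \\
  &= e^{i(j_2 - j_1)} \circ R^{k_2 \oplus k_1}
\end{aligned} \tag{2.18}$$

where $\oplus$ is the XOR operator,
hence $j = j_2 - j_1$ and $k = \overline{k_1}$.

• If $k_2 = 0$ then

$$\begin{aligned}
O &= e^{ij_2} \circ (R^{0} \circ e^{ij_1}) \circ R^{k_1} \\
  &= e^{ij_2} \circ e^{ij_1} \circ R^{k_1} \\
  &= (e^{ij_2} \circ e^{ij_1}) \circ R^{k_1} \\
  &= e^{i(j_1 + j_2)} \circ R^{k_1}
\end{aligned}$$

hence $j = j_2 + j_1$ and $k = k_1$.

Hence if $k_2 = 1$ then $j = j_2 - j_1$, $k = \overline{k_1}$; otherwise $j = j_1 + j_2$, $k = k_1$.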

We have seen that we can represent an arbitrary orientation (isometry) by
the pair (j, k) ∈ M × B and, using the associativity of linear operators, we can
compute any expression involving composition and inversion of orientations.
It is computationally expensive, however, to apply an operator represented in
this form to actual objects, because a sine and a cosine must be computed. Due
to numerical inaccuracies an object (say a box) with vertical and horizontal
edges can be transformed by a quarter turn rotation into an object whose edges
are not precisely aligned with the axes. Adding and subtracting elements of
M can also lead to numerical inaccuracy, as elements of M are represented in
the computer by real numbers and a modulo 2π operation has to be performed
on the result of every real addition (or subtraction) to ensure that the result
is an element of M.

In  the  RSG  the  choice  therefore  was  made  not  to  support  arbitrary  ro¬ 
tations  and  reflections.  Most  VLSI  circuit  layouts  are  built  using  boxes  of 
various  layers  where  the  boundaries  of  the  boxes  are  vertical  or  horizontal 
lines, i.e. parallel to one of the coordinate axes. Hence in most cases it is
sufficient to support all orientations that transform vertical and horizontal
lines into vertical and horizontal lines.


The four multiples of the quarter turn rotation are the only rotations that
have this property. The only reflections that have this property are those
that transform vertical edges into vertical edges and horizontal edges into
horizontal edges, which are the two reflections about the coordinate axes,
and those that transform vertical edges into horizontal edges and vice versa,
which are the reflections about the two 45 degree lines passing through the
origin. These four reflections can be generated by first reflecting about the
y axis and then applying one of the four quarter turn rotations.⁷

Just as arbitrary orientations can be represented by an element of M × B,
these eight basic orientations can be represented by j, an element of $Z_4$
($Z_4 = \{0, 1, 2, 3\}$), and a Boolean k, hence by an element of $Z_4 \times B$. This
would correspond to the orientation $e^{ij\pi/2} \circ R^k$ in the previous notation. Using
the induced addition and subtraction on $Z_4$ (i.e. modulo 4) the rules for
composing and inverting orientations are the same as previously described
using the M × B representation. Orientations can now easily be applied to
vectors and boxes since performing a reflection about the y axis corresponds
to changing the x coordinate of an object to −x. The four quarter turn
rotations require only permutations and negations of the two coordinates.
For instance the one quarter turn rotation maps the x coordinate into the y
coordinate and the y coordinate into the −x coordinate. Figure 2.5 shows the
mapping of coordinates for each of the four basic rotations.
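The rules of Sections 2.6.1 and 2.6.2, specialised to the eight basic orientations, can be sketched in a few lines. The class and function names below are hypothetical; only the (j, k) encoding and the composition and inversion rules come from the text.

```python
from typing import NamedTuple, Tuple

class Orientation(NamedTuple):
    j: int  # number of quarter turns, an element of Z4 = {0, 1, 2, 3}
    k: int  # 1 if a reflection about the y axis is applied before the rotation

IDENTITY = Orientation(0, 0)

def invert(o: Orientation) -> Orientation:
    # A reflection is its own inverse; a rotation is undone by the opposite turn.
    return o if o.k else Orientation((-o.j) % 4, 0)

def compose(o2: Orientation, o1: Orientation) -> Orientation:
    """Return o2 o o1, i.e. o1 is applied first."""
    if o2.k:
        return Orientation((o2.j - o1.j) % 4, o1.k ^ 1)
    return Orientation((o2.j + o1.j) % 4, o1.k)

def apply(o: Orientation, v: Tuple[int, int]) -> Tuple[int, int]:
    """Apply an orientation to a vector using only negations and swaps."""
    x, y = v
    if o.k:
        x = -x                  # reflection about the y axis
    for _ in range(o.j):
        x, y = -y, x            # one quarter turn: x -> y, y -> -x
    return (x, y)

ALL = [Orientation(j, k) for k in (0, 1) for j in range(4)]
```

Because every operation is integer arithmetic modulo 4 or a sign change, a box with axis-parallel edges stays exactly axis-parallel, which avoids the rounding problems of the M × B representation.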
Chapter 3


The  Algorithm 


3.1  Algorithm  Overview 

The  RSG  algorithm  (see  Figure  3.1)  consists  of  first  reading  in  the  sample 
layout  in  order  to  define  the  primitive  cells  and  build  up  the  initial  interface 
table. 

New  cells  are  then  created  in  a  two  step  sub-algorithm.  The  first  step  in 
the  sub-algorithm  consists  of  building  a  connectivity  graph  for  the  new  cell. 
The  connectivity  graph  for  the  new  cell  is  a  graph  whose  vertices  represent 
partial  instances  whose  cell  type  is  known  but  whose  location  and  orientation 
are  as  yet  unspecified. 

The edges between vertices represent interfaces between instances and the
weights assigned to them are the interface index numbers. The connectivity
graph need only be a spanning tree since cycles in the graph contain redundant
information. For a given sample layout, each connectivity graph gives rise
to a unique layout (see Figure 3.2). Interfaces provide the local placement
constraints between (two) cells. The connectivity graph provides information
about the global placement of all the subcells in a macrocell. The graph sets
up an implicit system of linear equations whose unknowns are the placements
and orientations of the (pseudo) instances in the graph and where the given
parameters are the interfaces between the various cells.

Figure 3.1: RSG algorithm

Figure 3.2: Graph and Layout Equivalents

The second step consists of converting the connectivity graph into a layout.
This  is  done  by  first  selecting  a  root  node  in  the  graph  and  arbitrarily  placing 
and  orienting  the  corresponding  instance.  The  graph  is  then  traversed,  and 
each  of  the  nodes  in  the  graph  (which  initially  are  all  partial  instances)  gets 
expanded  into  a  complete  instance  with  a  location  and  an  orientation.  The 
location and orientation $L_b$ and $O_b$ of a partial instance B can be computed
from the location and orientation $L_a$ and $O_a$ of one of its already traversed
neighboring nodes A using the formulas

$$O_b = O_a \circ O_{ab} \tag{3.1}$$

$$L_b = O_a V_{ab} + L_a \tag{3.2}$$

where $(V_{ab}, O_{ab})$ is the interface between A and B. Finally, once a new cell
is created, if it is to be used in a larger cell, it is necessary to define new
interfaces between it and the already existing cells.
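As a worked instance of equations 3.1 and 3.2, with values chosen purely for illustration: suppose A has already been placed at $L_a = (10, 0)$ with $O_a$ the quarter turn rotation $(x, y) \mapsto (-y, x)$, and the interface between A and B is $V_{ab} = (4, 1)$ with $O_{ab} = R$, the reflection about the y axis.

```latex
\begin{aligned}
O_b &= O_a \circ O_{ab} = e^{i\pi/2} \circ R, \\
L_b &= O_a V_{ab} + L_a = (-1, 4) + (10, 0) = (9, 4).
\end{aligned}
```

B is therefore reflected about the y axis, then turned by a quarter turn, and its origin lands at (9, 4).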

Since the connectivity graph need only be a spanning tree, many of the
interfaces that occur in the final layout need not be present in the sample
layout. Figure 3.3 shows a cluster of instances of A, B, C and D assembled
together. The corresponding connectivity graph is also shown. The labels
inside the nodes of the connectivity graph identify the nodes as well as
the instances that correspond to them. Since the connectivity graph need
only be a spanning tree, it does not have to contain edges between A and
D, A and C, or B and D. This is because with or without those edges the
graph remains a single connected component (i.e. one can reach any node
starting from any node by walking along edges in the graph). Since the
three described edges are not present in the graph, the interfaces $I_{ad}$
(or $I_{da}$), $I_{ac}$ (or $I_{ca}$), and $I_{bd}$ (or $I_{db}$) are never accessed by the RSG,
and therefore need not be present in the sample layout. Hence the creation
of both design file and sample file is simplified by requiring that the graph
be only a spanning tree.

Figure 3.3: Graph Connectivity Requirements


3.2  Advantages  of  the  method 

This  (augmented)  two  step  process  of  first  determining  connectivity  and 
then  using  the  connectivity  information  along  with  cell  definition  and  cell 
interface  information  to  build  a  layout,  provides  a  clean  separation  between 
the  graphical  and  procedural  information.  The  procedural  information  in 
the  design  file  is  used  to  build  the  connectivity  graph  and  remains  constant 
over  different  implementations  of  the  design  as  given  by  the  sample  layout. 
The  graphical  information  from  the  sample  layout  is  used  to  transform  the 
connectivity  graph  into  a  physical  layout  of  a  particular  implementation  of  the 
design.  Cell  spacing  parameters  which  relate  to  the  graphical  information  are 
never  accessed  or  manipulated  in  the  design  file.  This  delayed  binding  on  the 
location and orientation of instances allows for clean macro abstraction in the
design file. Since in the design file partial instances are connected together
without  assigning  actual  locations  and  orientations  to  them,  it  is  possible  to 
build  subgraphs  without  prior  knowledge  of  where  and  with  which  orientation 
the  instances  in  the  subgraph  will  be  used.  It  is  easier  and  cleaner  to  write 
and  compose  macros  for  sub-graphs,  because  the  state  of  a  calling  macro  does 
not  side-effect  the  called  macro  by  imposing  a  starting  location  and  a  starting 
orientation  at  which  to  start  assembling  the  subcells  (i.e.  the  called  macro 
returns  the  same  subgraph  regardless  of  how  the  calling  macro  will  choose 
to  connect  the  subgraph  and  regardless  of  the  final  calling  parameters  of  the 
instances  of  the  subgraph).  Macro  abstraction  suppresses  details  of  how  and 
where  a  macro  for  generating  a  subgraph  gets  called  and  allows  the  designer 
to  concentrate  only  on  the  connectivity  of  the  subgraph. 


3.3  Limitations 


The  two  step  process  as  described  in  the  previous  section  provides  a  high 
level  of  separation  between  the  graphical  and  procedural  part  of  the  layout 
process. Since geometrical parameters are not accessed in the design file,
however, decisions based on the size and shape of the final layout such as
placement  and  routing  are  difficult  to  make.  For  example  the  choice  between 
the  two  routing  configurations  in  Figure  3.4  requires  knowledge  of  the  sizes 
and  shapes  of  the  two  cells  A  and  B  as  well  as  the  size  of  the  routing  channels. 




Figure  3.4:  Different  routing  configurations 

3.4  Connectivity  Graphs  in  Greater  Detail 

The  purpose  of  this  section  is  to  investigate  some  of  the  properties  of 
connectivity  graphs  both  in  terms  of  data  structures  as  well  as  in  terms  of 
their  mathematical  properties.  The  previous  section  described  an  equivalence 
between  connectivity  graphs  and  physical  layouts.  Actually  (for  a  given  sam¬ 
ple  layout)  to  each  connectivity  graph  there  corresponds  a  whole  equivalence 
class  of  layouts.  All  the  layouts  in  an  equivalence  class  are  such  that  any  ele¬ 
ment  in  the  class  can  be  transformed  into  any  other  element  in  the  class  by  an 
affine  isometry  i.e.  all  elements  in  an  equivalence  class  are  identical  modulo 
an  affine  isometry.  By  selecting  a  root  node  in  the  graph  and  by  placing  and 
orienting  the  corresponding  instance  a  particular  element  in  the  equivalence 
class  is  identified,  namely  the  one  where  the  instance  corresponding  to  the 
root  node  has  the  chosen  placement  and  orientation. 

Connectivity  graph  data  structures  must  have  bilateral  edges.  If  there  is 
an  edge  between  nodes  A  and  B  then  in  the  data  structure  of  A  there  must  be 
a  pointer  to  the  data  structure  of  B  and  in  the  data  structure  of  B  there  must 
be a pointer to the data structure of A. This is because when a connectivity
graph is being created, the root node of the graph (which is arbitrarily
chosen, placed and oriented, and which is the starting point for traversing
the graph in order to convert it into a layout) may not be known. Macros for
generating  subgraphs  of  a  layout  have  no  knowledge  of  how  the  subgraphs 
they  generate  will  be  connected  together  by  their  calling  macros  in  order  to 
make  larger  graphs.  For  example  if  a  macro  M  for  creating  graphs  were  to 
return  the  subgraph  of  Figure  3.2,  either  node  B  or  node  A  could  be  a  leaf 
node  in  the  graph  (i.e.  a  node  with  only  one  connection  to  it)  depending  on 
whether  node  A  or  node  B  was  connected  to  the  rest  of  the  connectivity  graph 
by the macro that called M. Hence even if the graph is a spanning tree the
parent-son relationship between directly connected nodes in the graph is not
known until the graph is traversed. This is why during the graph traversal
one must be able to get to node B from node A and also get to node A from
node B, because we do not know which of the two nodes will be visited first.

The bidirectionality of the graph is essentially a data structure matter
that is constrained only by the graph traversal requirements and not by the
abstract mathematical properties of the graph. This requirement does not
determine whether or not the graph is directed. A graph G = (N, E),
where N is a nonempty set of nodes and E is the set of edges, is said to be
directed if the edges are ordered pairs (v, w) ∈ N × N. That is
to say there is a privileged direction for the edges of the graph. A graph
G = ({A, B}, {(A, B)}) (a graph with nodes A and B and an edge from A to
B) can have a bilateral data structure, which means that from node A we can
go to node B and vice versa, and can at the same time be directed, which
means that the (A, B) edge has a privileged direction (i.e. the (B, A) edge
may not exist).

Figure 3.5: Interface ambiguity in undirected graphs.

We  now  need  to  decide  whether  or  not  connectivity  graphs  for  the  RSG 
should  be  directed  graphs  or  non-directed  graphs.  What  is  needed  is  a  graph 
that  for  a  given  sample  layout  uniquely  defines  an  output  layout  (modulo  an 
affine  isometry).  If  the  cell  types  of  nodes  A  and  B  are  distinct  then  knowing 
the  locations  and  orientations  of  node  A  it  is  always  possible  to  determine  the 
placement  and  orientation  of  node  B  because  the  right  hand  side  of  equations 
2.1  and  2.2  are  well  defined.  Hence  at  first  it  would  seem  that  an  undirected 
graph  would  suffice.  However,  in  the  case  of  Figure  3.5,  if  we  know  the 
location  and  orientation  of  the  left  node,  there  are  two  possibilities  for  the 
placement  and  orientation  of  the  right  node. 

If $I_{ab} = (V_{ab}, O_{ab})$ is an interface between A and B then using equations
2.1 and 2.2

$$I_{ba} = (V_{ba}, O_{ba}) = (I_{ab})^{-1} = (-(O_{ab})^{-1} V_{ab}, (O_{ab})^{-1}) \tag{3.3}$$

is an interface between B and A.

Therefore if $I_{aa} = (V_{aa}, O_{aa})$ is an interface between A and A then

$$I'_{aa} = (V'_{aa}, O'_{aa}) = (I_{aa})^{-1} = (-(O_{aa})^{-1} V_{aa}, (O_{aa})^{-1}) \tag{3.4}$$

is also an interface between A and A. In equations 2.1 and 2.2 it is not clear
whether $V_{aa}$ and $O_{aa}$ or $V'_{aa}$ and $O'_{aa}$ should appear on the right hand side
of those equations. The problem here is not that of determining the right
interface index (interface number) so as to choose the right interface from
the interface table. The real problem is determining which instance the left
node in Figure 3.5 refers to. Another problem, which we will deal with later,
is that we do not know which of the two interfaces $I_{aa}$ or $I'_{aa}$ gets loaded into
the interface table. The two interpretations of Figure 3.5 can lead to
non-equivalent layouts as shown in Figure 3.6. If the edges are undirected then
there is no way to discriminate between these two cases. In the first versions
of the RSG this problem caused the final layout to depend on how the graph
was actually traversed. What is needed is a way of discriminating between
the two nodes of Figure 3.5 which are directly connected together and have
the same celltype. This can be done by giving privileged directions to the
edges in the graph (making the graph a directed graph).

Suppose we are able to characterize interfaces according to some criterion so
as to discriminate between the two possible interfaces $I_{aa}$ and $I'_{aa}$ and select
one of them (which I will refer to as $I^*_{aa}$). We then adopt the convention that
if there is a directed edge in Figure 3.7 from $A_1$ to $A_2$ ($A_1$ and $A_2$ have the
same celltype: the indices are just to distinguish between the two of them) then
it is $A_1$ that serves as the reference instance, i.e. $A_1$ refers to the instance in
the interface (see Figure 3.7) that is deskewed to orientation North and at
whose point of call the interface vector begins. Knowing the placement and
orientation of $A_1$ we can determine the placement and orientation of $A_2$ using
equations 2.1 and 2.2 with the interface $I^*_{aa}$. Conversely, knowing the placement
and orientation of $A_2$ we can determine the placement and orientation of $A_1$
using the interface $(I^*_{aa})^{-1}$. The main problem has been to determine when
to use $I^*_{aa}$ and when to use $(I^*_{aa})^{-1}$, and this problem has been solved by
making the edges of the graph directed.

Figure 3.7: Resolving layout ambiguity with a directed graph.


The problem that now remains to be solved is that of selecting $I^*_{aa}$ from
$I_{aa}$ and $I'_{aa}$. One possible way to perform the selection process is to
mathematically characterize a property that is possessed by only one of the two
interfaces $I_{aa}$ or $I'_{aa}$. This property cannot depend on the interface
vectors alone because it is possible to have $I_{aa} \neq I'_{aa}$ with $V_{aa} = V'_{aa}$, making
the selection between $I_{aa}$ and $I'_{aa}$ using $V_{aa}$ and $V'_{aa}$ impossible. For
example if $I_{aa} = (0, \text{East})$ then $I'_{aa} = (I_{aa})^{-1} = (0, \text{West})$, hence $V_{aa} = V'_{aa}$ and
$I_{aa} \neq I'_{aa}$. Similarly the property cannot depend on the interface orientation
alone because it is possible to have $I_{aa} \neq I'_{aa}$ with $O_{aa} = O'_{aa}$. As an example
let $I_{aa} = (V_{aa}, \text{North})$. Then $I'_{aa} = (-V_{aa}, \text{North})$. Hence $O_{aa} = O'_{aa}$ and
$I_{aa} \neq I'_{aa}$.

Since any reasonable mathematical criterion for selecting between $I_{aa}$ and
$I'_{aa}$ depends on both the interface vector and the interface orientation, chances
for finding a simple user-understandable selection criterion are seriously
jeopardized. The user does in fact need to know which of the two interfaces gets
loaded into the interface table, because the effect of loading $(I^*_{aa})^{-1}$ in the
table instead of $I^*_{aa}$ is that of inverting the direction of all the edges (with the
appropriate interface number) between nodes of celltype A.

The RSG solves this problem by allowing the user to specify (in the sample
file) the right interface by graphically discriminating between the two
instances of Figure 3.7 (which might occur in the sample file). If it is
possible to graphically identify $A_1$ in the sample file then it is possible to force
$I^*_{aa} = (V^*_{aa}, O^*_{aa})$ (see Figure 3.7) to be the interface that gets loaded into the
interface table by forcing $A_1$ to be the reference instance at whose point of
call the interface vector begins and whose orientation is deskewed to North.

We  have  seen  that  the  connectivity  graph  data  structure  must  have  bilat¬ 
eral  edges  but  that  the  graph  itself  must  be  directed.  Only  the  edges  between 
nodes  of  the  same  celltype  need  to  be  directed  as  direction  information  on 
edges  between  nodes  of  different  celltype  is  not  used. 
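The combination of bilateral pointers and directed edges can be sketched as a small traversal. This is an illustration under stated assumptions rather than the RSG's own data structure: orientations use the $e^{ij} \circ R^k$ form of Section 2.6 stored as a pair (u, k) with u a unit complex number, points are complex numbers, and the names `Node`, `connect` and `layout` are hypothetical. Walking an edge along its direction uses the stored interface with equations 3.1 and 3.2; walking it backwards uses the inverse interface of equation 3.3.

```python
from collections import deque

def apply(o, z):
    """Apply orientation (u, k) to point z: optional y-axis reflection, then rotation."""
    u, k = o
    return u * (-z.conjugate() if k else z)

def compose(o2, o1):
    """Return o2 o o1 (o1 applied first), following Section 2.6.2."""
    (u2, k2), (u1, k1) = o2, o1
    return (u2 * u1.conjugate(), k1 ^ 1) if k2 else (u2 * u1, k1)

def invert(o):
    """A reflection is its own inverse; a rotation is undone by its conjugate."""
    u, k = o
    return o if k else (u.conjugate(), 0)

def invert_interface(iface):
    """Equation 3.3: the interface seen from the other instance."""
    v, o = iface
    o_inv = invert(o)
    return (-apply(o_inv, v), o_inv)

class Node:
    def __init__(self, name, celltype):
        self.name, self.celltype = name, celltype
        self.edges = []            # entries: (neighbour, interface, forward flag)
        self.location = None
        self.orientation = None

def connect(ref, other, iface):
    """Directed edge ref -> other, recorded on both nodes (bilateral pointers)."""
    ref.edges.append((other, iface, True))     # walking along the edge direction
    other.edges.append((ref, iface, False))    # walking against it

def layout(root, location=0j, orientation=(1 + 0j, 0)):
    """Place every node reachable from root, breadth first."""
    root.location, root.orientation = location, orientation
    queue = deque([root])
    while queue:
        a = queue.popleft()
        for b, iface, forward in a.edges:
            if b.orientation is not None:
                continue                                       # already placed
            v_ab, o_ab = iface if forward else invert_interface(iface)
            b.orientation = compose(a.orientation, o_ab)           # equation 3.1
            b.location = apply(a.orientation, v_ab) + a.location   # equation 3.2
            queue.append(b)

# Two instances of the same celltype joined by a hypothetical interface.
a1, a2 = Node("A1", "A"), Node("A2", "A")
connect(a1, a2, (3 + 0j, (1j, 0)))
layout(a1)
print(a2.location, a2.orientation)
```

Because the direction flag travels with the edge, the resulting layout no longer depends on which of the two same-celltype nodes the traversal happens to reach first.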




In  order  to  make  efficient  use  of  the  framework  of  the  RSG  we  must  be 
able  to  build  large  and  complex  connectivity  graphs  easily  and  efficiently.  It 
is  therefore  imperative  that  the  language  for  specifying  design  files  supports 
good  abstraction  and  powerful  decision  making.  The  design  file  interpreter 
has  been  embedded  inside  a  Lisp  interpreter  so  that  the  full  power  of  a  struc¬ 
tured  programming  language  is  available  to  the  designer.  The  interpreter 
provides  a  variant  of  the  Lisp  Programming  Language  (a  subset  of  it)  with 
a  few  special  primitives  for  building  and  manipulating  connectivity  graphs 
as  well  as  for  converting  connectivity  graphs  into  layouts  (a  BNF  grammar 
for  the  language  can  be  found  in  Appendix  A).  Primitives  for  manipulating 
encoding  tables  (such  as  PLA  truth  tables)  have  also  been  added. 

The design of the language was instrumental in defining the underlying
mechanisms in the RSG. It allowed me to get a user's perspective on what
should be the right abstraction mechanisms even before I had an
understanding of how these mechanisms could be implemented. Besides the fact that
the language contains special features specific to the RSG, the language
differs from standard LISP (for example MACLISP [27]) in two ways. First, the
language does not support LIST structures. Instead it provides primitive
facilities for arrays because arrays are more suited to array-like regular
structures. Lists are not used (see Section 3.4) to implement connectivity graphs
since these graphs are more than simple linked lists. The second difference
is that procedures are not first class objects, i.e. it is not possible to pass a
procedure as a parameter to another procedure. This decision was made to
simplify the design of an efficient parser and interpreter.


4.1  Interfacing  the  parameter  file  to  the  de¬ 
sign  file 

The parameter file to design file interfacing is done through variable
scoping rules. The parameter file sets up parameter values in the global
environment of the design file interpreter. These parameters can be accessed
through variable scoping rules. A form of lexical scoping proves to be the
simplest  and  most  efficient  way  to  do  the  scoping.  A  variable  lookup  during 
execution  of  the  design  file  first  causes  that  variable  to  be  searched  for  in  the 
environment  of  the  procedure  being  executed.  If  the  search  fails  a  new  search 
is  then  performed  in  the  global  environment  of  the  interpreter.  Should  this 
search  fail  too  it  is  assumed  that  the  variable  is  a  cell  name  and  a  search  is 
performed  on  the  table  of  available  cells. 

For example, suppose the variable corecell in Figure 5.4(a) is meant to refer
to a cell. Since corecell is not assigned in the environment (it is not a formal
or a local variable of the macro), the interpreter knows that it is either a
variable defined in the global environment or a cell name and initiates a search
in the global environment and then in the cell table. This scoping methodology
allows  variables  to  be  handled  uniformly  whether  they  are  calling  parameters 
of  the  macro,  parameters  set  up  in  the  parameter  file,  or  just  cells.  Hence  a 
powerful  coupling  between  the  parameter  file  and  the  design  file  is  achieved 
by  immersing  the  design  file  evaluation  in  a  (global)  environment  set  up  by 
the  parameter  file. 

Personalization of the variable names in the design file according to the
cell names used in a sample file can also be achieved using the parameter
file and scoping rules. A statement of the form corecell = basiccell in the
parameter file would cause the variable named corecell in Figure 5.4 to refer
to the cell named basiccell in the sample layout (or, to be more general, the
cell named basiccell in the current cell definition table, which contains new
cells as well as the primitive cells in the sample layout).

The sequence of steps taken by the interpreter to evaluate the variable
corecell during execution of the design file is summarized in Figure 4.1.
Dynamic scoping was considered and rejected because many of the variables in
a macro refer to cell names defined in the cell table or to variables defined
in the global environment, so the whole current chain of environments would
often have to be searched needlessly.
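
The lookup order described in this section can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the RSG's
CLU implementation: environments and the cell table are modelled as plain
dictionaries, and the names resolve and Symbol are hypothetical. It is also
assumed that a binding whose value is itself a name (as set up by a statement
such as corecell = basiccell) is resolved again by the same rules.

```python
class Symbol(str):
    """A value that is itself a name and must be resolved again (assumption)."""


def resolve(name, local_env, global_env, cell_table):
    """Resolve a name: procedure environment, then global, then cell table."""
    seen = set()
    while True:
        if name in seen:
            raise NameError(f"circular binding for {name!r}")
        seen.add(name)
        if name in local_env:            # formals and locals of the procedure
            value = local_env[name]
        elif name in global_env:         # parameters set up by the parameter file
            value = global_env[name]
        elif name in cell_table:         # last resort: treat the name as a cell
            return cell_table[name]
        else:
            raise NameError(f"unbound variable or unknown cell: {name!r}")
        if not isinstance(value, Symbol):
            return value
        name = value                     # the binding names another variable or cell


# Usage: the parameter file has bound corecell to the name basiccell.
cells = {"basiccell": "<celldefinition of basiccell>"}
global_env = {"corecell": Symbol("basiccell"), "xsize": 6}
print(resolve("corecell", {}, global_env, cells))
```

Only two environments are ever consulted before the cell table, which is the
property that made this scheme preferable to dynamic scoping.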

4.2  Macros  and  Functions 

In Lisp and other languages that support procedural abstraction, a procedure
can return a single object (or a pointer to it). Connectivity graphs used in
the RSG have several nodes in them, and what can be returned by a


Lookup corecell in the environment of mcell     Failed
Lookup corecell in the global environment       A variable named basiccell
Lookup basiccell in the environment of mcell    Failed
Lookup basiccell in the global environment      Failed
Lookup corecell in the cell table               (celldefinition of basiccell)


Figure 4.1: Environment lookup.

procedure is a pointer to one of them. A pointer to a single node in a
subgraph, however, may not be sufficient to manipulate the subgraph
efficiently. In the process of building graphs from subgraphs, a calling macro
may need to identify several key nodes in the subgraph returned by the called
macro in order to connect these key nodes to nodes in other subgraphs. Since
all nodes look alike except for their celltype (a subgraph may even contain
only one celltype), it is extremely difficult to determine the nodes of
interest (the ones which are to be connected to other nodes) by walking
through the graph, starting from the node to which we have a pointer. Even if
the calling macro were sufficiently smart to identify the nodes of interest in
a subgraph, that macro would probably contain a large part of the information
needed to build the subgraph, defeating the spirit of macro abstraction and
information hiding.

A mechanism is needed whereby a macro can return several objects at a time.
To further enhance information hiding and at the same time enhance
generality, the calling macro should not need to know how many objects are
returned by a called macro, or in what order. The calling macro should be
able to pick the nodes of interest to it from a menu of available objects.
The way this is achieved in the RSG is by making macros return the whole
environment frame that was used during their execution. This method provides
great flexibility, since any variable bound during the execution of the
called macro can be accessed using the subcell command. The subcell command
provided by the interpreter allows the selection of a particular variable in a
user-specified environment. If E is an environment (returned by a macro) and
V is a variable bound in that environment, then (subcell E V) returns the
value to which V is bound in the environment E.
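
The difference between returning a value and returning an environment frame
can be illustrated with a short Python sketch. It is a model only: frames are
plain dictionaries, procedure bodies are ordinary Python functions that
receive their frame, and run_function, run_macro and mtopregs_body are
hypothetical names. Only subcell, tregs, ref and mtopregs come from the text.

```python
def run_function(body, **formals):
    """Function semantics: the result is the value of the last statement."""
    env = dict(formals)
    return body(env)


def run_macro(body, **formals):
    """Macro semantics: the result is the whole frame used during execution."""
    env = dict(formals)
    body(env)            # the value of the last statement is discarded
    return env


def subcell(env, var):
    """Select the value bound to var in a frame returned by a macro."""
    return env[var]


def mtopregs_body(env):
    # Hypothetical body: build one register per column and keep a handle on
    # the instance that callers will want to connect to.
    env["regs"] = [f"topreg{i}" for i in range(env["xsize"])]
    env["ref"] = env["regs"][0]
    return len(env["regs"])


tregs = run_macro(mtopregs_body, xsize=4)
print(subcell(tregs, "ref"))   # the caller picks what it needs by name
```

The caller never depends on how many objects the macro created or in what
order; it only needs to know the name under which the useful one was bound.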

As an example, in Figure 5.4(b) the 4th statement of macro mall assigns to
the variable tregs the object returned by the macro call to mtopregs. Macro
mtopregs is assumed to create a cell named topregistername and to return an
environment in which one of the instances of topregistername (one on which it
is useful to get a handle) is bound to the variable ref. Statement 5 of mall,
which defines a new interface between cells topregistername and arrayname,
requires the instance (of topregistername) bound to the variable ref in the
environment tregs. The (subcell tregs ref) expression in statement 5 returns
the appropriate instance.

The RSG has two classes of procedures. The first class, functions, operate
just as in LISP and return a single value, which is the value of the last
statement executed in the body of the function. Their syntax is almost
identical to that of MACLISP (a variant of LISP).

The second class of procedures, macros, are identical to functions in every
respect except that they return their evaluation environment instead of the
value of the last statement executed. Their syntax is the same as for
functions except that the LISP function header defun is replaced by macro.
The interpreter also needs to know ahead of time whether a statement of the
form (<function or macro name> <arg1> ... <argn>) is a function call or a
macro call, and hence the interpreter requires that the macro name begin with
an


<celldefinition>
    <name>
    <obj1>
    <obj2>
    ...
    <objn>

Figure 4.2: Celldefinition Data Structure.


4.3  Data  Structures 

This section describes in detail the data structures used in the RSG by
spelling out each of them. Its purpose is to give the reader a concrete feel
for the implementation of the abstract data types described in the previous
chapters, and it serves as an introduction to the next section. Three data
structures will be examined: the cell definition, the instance and the node.

Figure 4.2 shows the cell definition data structure, which consists of a name
(the name of the cell) and a list of objects in the cell.

Figure 4.3 shows how the instance data structure builds on the cell
definition data structure by adding calling parameters (a location and an
orientation) to it.

Figure 4.4 shows how the node data structure is in turn built from the
instance and a list of edges to other nodes. The location and orientation
fields of the corresponding instance data structure may or may not be blank,
depending on whether or not the graph (which contains the node) has been
traversed. Each edge in the edge list of the node has a bit to indicate
whether the edge is emanating from or terminating at the current node, an
integer for the weight of the edge, and a pointer to the other node attached
to the edge.¹


<instance>:  <location>, <orientation>, <celldefinition>

Figure 4.3: Instance Data Structure.

<node>

Figure 4.4: Node Data Structure.
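
As a rough illustration, the three data structures described in this section
can be written down as Python dataclasses. This is a sketch under
assumptions, not a transcription of the CLU clusters: the field names are
chosen here for readability, a blank field is represented by None, and the
direction bit is represented by a boolean.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class CellDefinition:
    name: str                                            # name of the cell
    objects: List[object] = field(default_factory=list)  # objects in the cell


@dataclass
class Instance:
    celldefinition: CellDefinition
    location: Optional[Tuple[int, int]] = None   # blank until the graph is traversed
    orientation: Optional[str] = None            # blank until the graph is traversed


@dataclass(eq=False)                             # nodes are compared by identity
class Node:
    instance: Instance
    edges: List["Edge"] = field(default_factory=list, repr=False)


@dataclass
class Edge:
    emanating: bool    # direction bit: True if the edge leaves the owning node
    weight: int        # integer weight of the edge
    other: Node        # the node at the other end of the edge


basic = CellDefinition("basiccell")
a, b = Node(Instance(basic)), Node(Instance(basic))
a.edges.append(Edge(True, 1, b))     # the edge emanates from a ...
b.edges.append(Edge(False, 1, a))    # ... and terminates at b
print(a.instance.location, len(a.edges), b.edges[0].emanating)
```

Each edge is stored twice, once in the edge list of each endpoint, so that a
traversal can follow it from either side.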


4.4  Primitive  operators  for  connectivity  graphs 


This section describes mk_instance, connect and mk_cell, the three primitive
operators provided in the RSG for building and manipulating connectivity
graphs. The mutation of the data structures described in the previous section
under these operators is also shown.

4.4.1  mk_instance operator

The basic operator for creating connectivity graphs is the mk_instance
operator. The purpose of this operator is to create a pseudo-instance
connectivity graph node (the node data structure of the previous section).
Figure 4.5 shows in large font (the top line) a call to the mk_instance
operator as it would appear in the design file. The data structures that exist
before the operator is executed appear in unbroken lines and in normal font.
The data structures created or modified when the operator is executed appear
in broken lines and in italics. The edge list of the created node is empty and
the fields for the calling parameters of the corresponding instance are blank.
<return value> is the value of the calling expression (the top line in
Figure 4.5) that is returned by the design file interpreter.

Figure 4.5: mk_instance operator.

4.4.2  connect operator

The primitive operator for connecting two nodes together by an edge is the
connect operator. Figure 4.6 shows the effect of the connect statement with
the same conventions as in Figure 4.5. Notice that the edge of the node
corresponding to <arg1> (pointing to <arg2>) has a 1 as its direction bit,
which means that the edge emanates from <arg1>. Similarly, the corresponding
edge in <arg2> has 0 as its direction bit, which means that the edge
terminates at <arg2>.


Figure 4.6: connect operator.


4.4.3  mk_cell operator

The primitive operator for traversing and transforming a connectivity graph
into a layout is mk_cell. Figure 4.7 shows the effect of calling the mk_cell
operator in a design file. For simplicity's sake, nodes have been represented
by circles instead of expanding their internal data structures. Each node has
a pointer to the instance to which it corresponds. The calling parameters of
the instances are initially blank and are filled in as the graph is traversed.
The root of the graph is the node <arg2>, and its instance is called at
((0,0), North). As each new node is visited and its instance's calling
parameters are filled in, a pointer to the completed instance is pushed onto
the list of objects of the new cell being built. When the graph traversal is
complete, the object list of the cell definition of the new cell contains a
pointer to each of the instances. Not shown in the figure is the update of the
cell definition table, which after execution contains the binding
[<new cell name>, <new cell definition>].
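
The behaviour of the three operators can be sketched in Python as follows.
This is a deliberately simplified model and several things are assumptions:
the operators are ordinary functions rather than design file statements, the
interface table is a dictionary keyed by (source celltype, target celltype,
interface number) holding only an offset vector, and orientation is not
composed (every instance is placed facing North). The traversal order
(breadth first) is also a choice made for this sketch.

```python
from collections import deque


class Node:
    def __init__(self, celltype):
        self.celltype = celltype
        self.location = None       # calling parameters start out blank
        self.orientation = None
        self.edges = []            # entries: (direction bit, weight, other node)


def mk_instance(celltype):
    """Create a node with an empty edge list and blank calling parameters."""
    return Node(celltype)


def connect(a, b, weight):
    """Join a and b by an edge that emanates from a and terminates at b."""
    a.edges.append((1, weight, b))
    b.edges.append((0, weight, a))


def mk_cell(name, root, interfaces, cell_table):
    """Traverse the graph from root, fill in locations, register the cell."""
    root.location, root.orientation = (0, 0), "North"
    objects, queue = [root], deque([root])
    while queue:
        node = queue.popleft()
        for bit, weight, other in node.edges:
            if other.location is not None:
                continue                       # already placed
            if bit == 1:                       # edge leaves node: apply the vector
                dx, dy = interfaces[(node.celltype, other.celltype, weight)]
            else:                              # edge enters node: apply it reversed
                dx, dy = interfaces[(other.celltype, node.celltype, weight)]
                dx, dy = -dx, -dy
            other.location = (node.location[0] + dx, node.location[1] + dy)
            other.orientation = "North"
            objects.append(other)              # completed instance joins the cell
            queue.append(other)
    cell_table[name] = objects                 # binding [name, new cell definition]
    return objects


interfaces = {("reg", "core", 1): (10, 0), ("core", "reg", 1): (10, 0)}
table = {}
left, mid, right = mk_instance("reg"), mk_instance("core"), mk_instance("reg")
connect(left, mid, 1)
connect(mid, right, 1)
mk_cell("row", mid, interfaces, table)
print(left.location, mid.location, right.location)
```

Because absolute locations are only computed inside mk_cell, any node can be
chosen as the root without changing how the graph was built.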


(mk_cell <arg1> <arg2>)

Figure 4.7: mk_cell operator.


4.5  Implementation

Implementation of the RSG was rather straightforward. Roughly two thirds of
the code was overhead. Building and maintaining the layout database
represents a sizable portion of the code. The single largest part of the
code, however, is the design file interpreter, which parses the design file
(and parameter file) and then executes the commands in it. Writing a
reasonable design file parser and interpreter was also the most
time-consuming task, as the language supports full recursion, reasonable
error handling and high execution speed. Embedding the RSG in a VLSI database
type system such as Magic [26] or Schema [32] would have drastically reduced
this overhead. Furthermore, the availability of a suitable parser and
interpreter which could support macros and functions (as they are described
in Section 4.2) would have reduced the code by perhaps one half. In order to
embed the RSG in a VLSI database type system such as the two mentioned above,
facilities must be provided to create the design file language by performing
minor alterations to a standard programming language such as LISP, from which
the whole layout database would be accessible.

The RSG program is written in CLU [21] and consists of approximately 6000
lines of source code. The program is highly modularized and consists of
roughly a dozen major parts (CLU clusters), one for each major data type. The
code trades memory for greater execution speed. The interpreter makes
extensive use of CLU variants² and hence reduces the design file instruction
decode overhead. The interface table, the cell definition table and even the
interpreter environment frames are all implemented with hash tables [1],
which makes lookup extremely fast. While walking through a connectivity graph
the system accesses the interface table once for each node, hence it is
imperative that interface lookup be fast. While building large array
structures the graph may be built by a tight loop in one of the design file
macros. At each iteration all the variables have to be resolved by the
interpreter. Also, due to the scoping rules described in Section 4.1, several
environments (and the cell definition table) may have to be looked up to
resolve a variable binding (especially since variables often refer to cells,
as in Figure 4.1). It is therefore imperative that variable lookup also be
extremely fast. Hash tables have the unfortunate property of consuming a lot
of memory (memory concerns will become clearer in the next paragraph) and
becoming inefficient as the number of bindings grows beyond their individual
capacity, which is fixed at the time they are created. Care must be taken
while creating these tables to make them large enough to handle the required
number of bindings, but not so large as to waste memory.

² A variant is an object which has a special tag. Program flow can be
dispatched according to this tag.

The design file interpreter, which uses hash tables to implement
environments, pays particular attention to this by first computing the number
of formal and local parameters in a called procedure and then allocating a
hash table of the right size for the environment. Unlike a classical LISP
interpreter, which disposes of the environment frame when a procedure is
exited, the design file interpreter may keep environments alive much longer.
Macros return their evaluation environment. This environment may in turn be
held on to by the calling macro in its own environment, which may in turn be
retained by an even higher-level macro. It is possible to write a design file
which holds on to too many environments (several thousand) at a time and
exhausts the memory of a DEC-20. On the VAX this problem shows up in the form
of a substantial decrease in speed due to excessive page faults. However, it
is almost always possible to decrease the memory requirements (by orders of
magnitude) to within manageable limits by writing the design file so that it
does not hold on to unneeded environments.
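
The following Python sketch illustrates the design file style that this
paragraph recommends. It is an analogy only: frames are modelled as
dictionaries, mrow is a hypothetical macro, and Python's own memory
management stands in for the CLU garbage collector. The point is simply that
a frame remains reachable for as long as the caller keeps it, whereas keeping
only the handle that is actually needed lets the rest of the frame be
reclaimed.

```python
def mrow(n):
    """Hypothetical macro: builds n nodes and returns its whole frame."""
    nodes = [f"node{i}" for i in range(n)]
    return {"nodes": nodes, "ref": nodes[0], "n": n}


def build_wasteful(rows, width):
    # Every returned frame is stored, so all of their bindings stay reachable.
    return [mrow(width) for _ in range(rows)]


def build_frugal(rows, width):
    # Only the handle is kept; each frame becomes garbage as soon as the
    # handle has been extracted from it.
    return [mrow(width)["ref"] for _ in range(rows)]


frames = build_wasteful(3, 4)
handles = build_frugal(3, 4)
print(len(frames), handles)
```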

The RSG maintains its own database and as such is independent of the layout
file format. The RSG can be made to accept any file format by providing an
appropriate parser for the file format (this procedure requires that the code
be recompiled). In the parameter file, the user can select the layout file
format from a list of available file formats. Two layout file formats (CIF
[25] and DEF [2]) are supported. Plans for supporting HPDRAW [3] files are
also under way. Primitive functions can easily be added to the design file
interpreter provided they fulfill some input/output requirements.

The execution time is divided into roughly three equal parts: reading in the
source file and building up the initial interface table, parsing and
executing the design and parameter file, and writing the output file. A
32 x 32 Baugh-Wooley multiplier as discussed in Chapter 5 is generated in 5
seconds on a DEC-2060.

The basic RSG mechanisms can be easily implemented in any language that
supports good primitives for manipulating pointers and heaps (Pascal, C and
Lisp would be suitable candidates). Memory management for the design file
interpreter (a variant of Lisp), which supports heap storage and garbage
collection, is automatically handled by the underlying CLU³ runtime system.
Implementing the interpreter in a language which does not support automatic
garbage collection might require restricting the power of the design file
interpreter or implementing some form of automatic garbage collection.
Lexically scoped Lisp with some primitive mechanisms for manipulating arrays
would be very suitable, as many of the primitive operators provided by the
design file interpreter are also Lisp primitives. The Lisp closure mechanism
could perhaps be used to implement the macro⁴ mechanism in the RSG.


³ CLU supports heap storage and garbage collection.

⁴ Recall from Section 4.2 that macros return their environment.
Chapter  5


Example:  Pipelined  Array 
Multipliers 


A pipelined array multiplier provides a good illustration of the RSG's
ability to generate layouts for the kind of nontrivial regular structures that
typically arise in practice. Figure 5.1 shows a purely combinational 6x6
signed two's complement multiplier based on the Baugh-Wooley algorithm [13].
The multiplier consists of an array of two types of carry-save adders that
reduce the product to the sum of two words, which are then added in a final
row of cells connected as a carry-propagate adder. (The two diagonal
connections have been condensed to one for clarity.) Each cell type contains
an AND gate and a full adder: cell type I adds the bit-product a_i b_j to its
sum and carry inputs, and cell type II adds the complement of a_i b_j to its
sum and carry inputs. The carry-propagate adder consists of type I cells,
which are drawn as polygons to distinguish them from the carry-save cells.

Figure 5.1: Combinational Baugh-Wooley Multiplier


Using retiming transformations [18], the multiplier can be pipelined to any
degree in a manner that preserves the regularity of the inner array, but adds
irregularity to the periphery of the array in the form of input and output
register stacks. Figure 5.2 illustrates two pipelined versions of the
multiplier. (An integer near a dot represents the number of registers on the
corresponding connection.) The first version (Figure 5.2a) is a bit-systolic
multiplier that has at most one full adder combinational delay between any
two registers, and represents the highest possible degree of pipelining given
the choice of the full adder as the largest indivisible cell. The second
version (Figure 5.2b) implements a lower degree of pipelining, allowing at
most two combinational delays between any pair of registers. From a circuit
perspective, the optimal degree of pipelining is application and technology
dependent, so it is necessary to be able to automatically generate any degree
of pipelining.

Figure 5.2: (a) Bit-Systolic Multiplier; (b) Pipelined Multiplier


A pipelined multiplier of given size and level of pipelining can be
constructed by personalizing an array of basic cells which has been sized
according to the number of bits in the multiplier and multiplicand. Each cell
in the array must be personalized with respect to each of the following
options depicted in Figures 5.1 and 5.2:

1.  Cell type: Each cell must be programmed as either type I or type II to
correctly implement the signed two's complement algorithm. Type II
cells occur on the left and bottom edges of the carry-save array, except
for the cell at the lower left corner. All remaining locations require cell
type I.

2.  Cell  interface:  To  obtain  nearly  identical  circuit  topologies,  cell  types 
I  and  II  use  different  active  input  levels.  Furthermore,  active  output 
levels  are  affected  by  the  amount  of  pipelining.  Therefore,  each  cell 
interface  is  determined  by  the  type  of  cells  being  connected  and  the 
number  of  registers  on  the  connection. 


3.  Register  assignment:  The  placement  of  registers  on  connections  be¬ 
tween  cells  depends  on  the  degree  of  pipelining  and  the  locations  of  the 
cells  being  connected. 

4.  Clock  assignment:  Pipelined  systems  generally  require  several  clocks 
which  must  be  assigned  to  registers  according  to  their  location  in  the 
array.  Clock  assignment  is  further  complicated  by  the  need  to  em¬ 
ploy  such  circuit  techniques  as  precharging  to  reduce  area  and  power 
requirements. 
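
The cell type and clock options in the list above are purely functions of a
cell's position in the array, which the following Python sketch makes
explicit. The conditions follow those used in the mcell macro of Figure
5.4(a); the function names are hypothetical, and the reading of the indices
(xloc = xsize as the left-edge column, yloc = ysize as the bottom row of the
carry-save array, yloc = ysize + 1 as the carry-propagate row) is an
assumption inferred from the rule stated in item 1.

```python
def cell_type(xsize, ysize, xloc, yloc):
    """Return the type mask for the cell at (xloc, yloc)."""
    if yloc == ysize + 1:        # final carry-propagate row: always type I
        return "type1"
    if xloc == xsize:            # left edge: type II except at the lower left corner
        return "type1" if yloc == ysize else "type2"
    # elsewhere: type II only on the bottom row of the carry-save array
    return "type2" if yloc == ysize else "type1"


def clock_masks(xloc):
    """Return the four clock mask cells; the phase alternates with the column."""
    phase = "phi1" if xloc % 2 == 0 else "phi2"
    return [f"{phase}_{k}" for k in range(1, 5)]


# Print the type map of a 6x6 array, one row of the array per line.
for yloc in range(1, 8):
    print(" ".join(cell_type(6, 6, xloc, yloc)[-1] for xloc in range(1, 7)))
```

Register and interface options could be expressed in the same way once the
register configuration for a given degree of pipelining is known.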

In  addition  to  the  internal  array  configuration,  there  are  “edge  effects”  to 
consider  as  well: 

1.  Peripheral  registers:  In  order  to  properly  skew  the  inputs  and  deskew 
the  outputs,  registers  must  be  placed  along  the  periphery  as  determined 
by  the  retiming  transformations. 

2.  Input assignment: Ones and zeros must be assigned to the unused
inputs along the top and left edges as prescribed by the Baugh-Wooley
algorithm.

Cell masking is used extensively to convert an array personalization to
actual layout. A basic cell is created which contains the layout features
common to all cell personalities and which can accommodate the variations in
layout necessary to implement all design options. Mask cells are instantiated
on the basic cell to activate particular options by adding objects to the
various layers. Figure 5.3 illustrates this with a basic cell designed
specifically to optimize the electrical performance of the bit-systolic
multiplier of Figure 5.2a. This cell contains input inverters, full adder
circuitry, and six output registers. In this example, the basic-cell is
programmed to type I by the mask-cell type1, its carry input inverter is
programmed by mask-cell car1 to interface with a type II cell, and it is
assigned the clock φ1 by mask-cells phi1-1, phi1-2, phi1-3, and phi1-4. The
inner array of the multiplier is built up one cell at a time by first
personalizing a copy of basic-cell, and then adding it to the array. Then the
multiplier is completed by adding registers to the periphery of the array.

Figure 5.3: Multiplier Cell Maskings

Figure 5.4 shows two sections of the design file written to generate a
bit-systolic multiplier for any m-by-n case, and demonstrates the use of
macro abstraction, delayed binding, and interface inheritance. The mcell
macro of Figure 5.4a executes the personalization of basic-cell as a function
of array size and cell index, and is used to hierarchically build the
macrocell innerarray (the inner array of the multiplier). Delayed binding on
the absolute location of each personalized cell greatly simplifies the
definition and use of mcell in the creation of larger macrocells like
innerarray. The code in Figure 5.4b constructs the complete multiplier from
innerarray and three boundary macrocells, tregs, rregs, and bregs, which are
constructed from a single register cell. The three boundary cells are
connected to innerarray using interfaces that are inherited from an interface
between the basic cell and register cell. This example is cited to emphasize
that macrocells can be manipulated with absolutely no need to enter the
graphics domain and manually define interfaces or add spacing cells, as
required by layout generators with restricted powers of abstraction.

The  input  layout  file  in  Figure  5.5  demonstrates  the  ease  and  generality 
with  which  cell  interfaces  are  specified  in  the  RSG.  One  merely  provides  an  ex¬ 
ample  of  the  interface,  and  places  a  numerical  label  in  the  overlapping  region, 
as  for  example,  interface  number  1  (the  only  interface)  between  basic-cell 
and  typel.  The  RSG  then  creates  an  interface  vector  and  orientation  from 
this  graphical  specification,  and  uses  it  to  implement  all  instances  of  this 
interface  that  occur  in  the  final  circuit  layout.  The  layout  file  provides  a  nat¬ 
ural  means  for  the  user  specification  of  cell  layouts  and  interfaces  and  greatly 
reduces  the  amount  of  redundant  information  needed  to  characterize  regular 
circuit  layouts.  This  can  be  appreciated  by  comparing  Figure  5.5  with  the 
6x6  systolic  multiplier  layout  shown  in  Figure  5.6.  This  layout  also  illus¬ 
trates  the  amount  of  complexity  that  exists  in  practical  regular  structures, 
even though this design has been simplified by omitting the register masking
option. Register placement can easily be achieved by requiring that the user
provide a register configuration table in the parameter file. Ultimately, a
subprogram to perform the retiming can be embedded in the multiplier design
file. The program would take as input the parameter β, which specifies the
degree of pipelining, and produce as output a register configuration table
consistent with the multiplier size.

(macro mcell (xsize ysize xloc yloc)
  (locals c temp)
  (mk_instance c basiccell)
  (cond ((= (+ ysize 1) yloc) (connect c (mk_instance temp type1) t1inum))
        ((= xsize xloc)
         (cond ((= ysize yloc) (connect c (mk_instance temp type1) t1inum))
               (true (connect c (mk_instance temp type2) t2inum))))
        (true
         (cond ((= ysize yloc) (connect c (mk_instance temp type2) t2inum))
               (true (connect c (mk_instance temp type1) t1inum)))))
  (cond ((= (mod xloc 2) 0)
         (prog (connect c (mk_instance temp phi1_1) clk1inum)
               (connect c (mk_instance temp phi1_2) clk1inum)
               (connect c (mk_instance temp phi1_3) clk1inum)
               (connect c (mk_instance temp phi1_4) clk1inum)))
        (true
         (prog (connect c (mk_instance temp phi2_1) clk2inum)
               (connect c (mk_instance temp phi2_2) clk2inum)
               (connect c (mk_instance temp phi2_3) clk2inum)
               (connect c (mk_instance temp phi2_4) clk2inum))))
  (cond ((= yloc ysize) (connect c (mk_instance temp car2) car2inum))
        ((= yloc (+ ysize 1))
         (cond ((= xloc xsize) (connect c (mk_instance temp car1) car1inum))
               (true (connect c (mk_instance temp car2) car2inum))))
        (true (connect c (mk_instance temp car1) car1inum))))


(a)  Cell personalization


(macro mall (xsize ysize)
  (locals innerarray tregs bregs rregs tr1 array1 br1 rr1)
  (setq rregs (mrightregs ysize))
  (setq bregs (mbottomregs xsize))
  (setq innerarray (marray xsize ysize))
  (setq tregs (mtopregs xsize))
  (declare_interface topregistername arrayname 1
                     (subcell tregs ref) (subcell innerarray topright)
                     cell_to_topreginum)
  (connect (mk_instance tr1 topregistername) (mk_instance array1 arrayname) 1)
  (declare_interface arrayname bottomregistername 1
                     (subcell innerarray bottomright) (subcell bregs ref)
                     cell_to_bottomreginum)
  (connect (mk_instance br1 bottomregistername) array1 1)
  (declare_interface arrayname rightregistername 1
                     (subcell innerarray topright) (subcell rregs ref)
                     cell_to_rightreginum)
  (connect (mk_instance rr1 rightregistername) array1 1)
  (mk_cell "the_whole_thing" array1))


Figure 5.5: Layout File for a Systolic Multiplier


The optimum β for circuit performance within this class of pipelined
multipliers must be determined empirically through repeated iterations of
multiplier layout generation, circuit extraction, and electrical simulation.
The structure of these pipelined multipliers facilitates such an empirical
investigation by admitting very regular layouts that can be generated quickly
and interactively by the RSG. A study of the circuit issues determining
pipelined array multiplier performance [12] is now underway using the RSG for
layout generation, EXCL [23] for circuit extraction, and SPICE [30] for
circuit simulation. Preliminary simulations suggest that clock drive, clock
skew, and I/O pad drive, all of which vary with the level of pipelining and
multiplier size, will be the primary limitations to throughput. For large
multiplier sizes, macromodeling of critical paths can be used to alleviate the
computational requirements of SPICE.




Chapter  6 


Compaction 


6.1  Motivation 

Despite the fact that the RSG is technology, implementation and architecture
independent, the RSG by itself is not technology transportable (the RSG
cannot be made to produce designs in a new technology simply by providing a
new design rule file). A library of cells for the RSG designed in
an  older  technology  can  quickly  become  obsolete  as  new  process  technologies 
with  smaller  geometries  become  available.  Another  problem  with  the  RSG 
is  that  highly  electrically  optimized  layouts  require  fine  tuned  optimization 
of  the  bus  and  device  sizes.  These  optimizations  depend  on  the  particular 
configuration  (size)  of  the  final  layout.  Therefore  cells  designed  for  small  con¬ 
figurations  may  not  be  suited  for  larger  ones  which  might  require  larger  buses 
and  larger  transistors  to  drive  them.  Since  the  RSG  cannot  modify  the  prim¬ 
itive  cells  specified  in  the  sample  file  one  solution  to  the  layout  optimization 
problem  would  be  to  design  several  cells  for  each  functionality  where  each 
cell  is  designed  for  a  different  configuration  range.  For  example  one  might 


design  three  different  input  buffers  for  a  PLA.  One  type  of  buffer  would  be 
designed  for  use  in  PLAs  with  a  large  number  of  product  terms,  another  for 
use  in  PLAs  with  an  average  number  of  product  terms  and  one  for  use  in 
PLAs  with  a  small  number  of  product  terms.  This  method  of  choosing  the 
right  set  of  primitive  cells  according  to  the  replication  factors  requires  the
substantial  layout  investment  of  having  to  design  a  large  number  of  cells. 
Also  the  method  lends  itself  to  only  a  coarse  grained  optimization  due  to  the 
approximation  of  the  electrical  optimization  requirements  by  one  of  the  cells 
already  defined  in  the  library.  The  appropriate  device  sizes  given  some  speed 
and  power  constraints  could  be  derived  from  Macromodeling  Optimization 
techniques  [22]. 

The  problem  of  making  the  RSG  technology  transportable  and  allowing 
generation  of  electrically  optimized  layouts  could  be  achieved  by  using  a  spe¬ 
cial  kind  of  compactor  which  I  will  refer  to  as  a  leaf  cell  compactor.  I  believe 
that  this  kind  of  compactor  has  not  yet  been  seriously  investigated  because 
of  the  significant  difficulties  encountered  in  straightforward  compaction,  and 
also  because  the  usefulness  of  this  kind  of  compactor  is  closely  related  to 
an  RSG  type  design  methodology  whose  benefits  have  only  recently  been 
established. 

A  leaf  cell  compactor  is  a  compactor  capable  of  compacting  cells  from  a 
library  while  taking  into  account  how  the  cells  in  the  library  may  potentially 
interface  together.  For  example  if  cells  A  and  B  can  potentially  interface 
as  in  Figure  2.3  then  while  compacting  cell  A  we  have  to  take  into  account 
the  constraints  generated  by  its  connection  to  B.  If  cell  B  cannot  be  com¬ 
pacted  further  then  it  is  possible  that  due  to  the  constraints  between  A  and 
B ,  A  cannot  be  compacted  further  although  A  if  compacted  by  itself  on  a 


classical  compactor  could  stand  to  be  further  compacted.  Context  sensitive 
compaction  is  different  from  (and  probably  simpler  than)  hierarchical  compaction  [8],
which  starts  with  a  complete  final  layout  but  does  the  compaction  hierarchi¬ 
cally. 

The  advantages  of  a  leaf  cell  compactor  are  that  by  compacting  only  the 
primitive  cells  in  a  library  instead  of  fully  assembled  structures  the  com¬ 
paction  effort  is  not  duplicated  over  the  various  replication  factors  in  the 
layout.  For  example  if  a  cell  A  appears  one  hundred  times  in  a  layout,  a  compactor
operating  on  the  final  layout  (where  A  appears  one  hundred  times)
would  be  more  computationally  expensive  than  one  which  cleverly  compacts 
the  cell  A  only  once.  Also  the  compaction  may  only  be  performed  once  for 
a  given  set  of  design  rules  (and  other  constraints  such  as  bus  and  device
sizing)  instead  of  running  the  compactor  on  each  new  structure  created  (by  the
RSG).  These  two  factors  (i.e.  the  compaction  effort  not  being  duplicated  over 
the  various  replication  factors  and  also  the  compaction  being  performed  only 
once  and  not  on  each  structure  generated)  can  lead  to  orders  of  magnitude 
improvements  in  computation  costs,  perhaps  allowing  implementations  previously
thought  too  computationally  costly  (such  as  simulated
annealing  [16]).

The  costs  associated  with  a  leaf  cell  compactor  are: 

1)  Perhaps  a  more  complex  compactor. 

2)  After  compaction  all  instances  of  a  cell  A  in  the  final  layout  have  exactly 
the  same  geometry.  In  the  case  of  a  classical  compactor  which  first  flattens 
the  layout  (gets  rid  of  the  cell  hierarchy)  before  compacting  it,  circuitry 
which  used  to  belong  to  instances  of  A  may  end  up  having  different  layout 
geometries. 


The  relaxation  of  the  constraint  that  all  instances  of  A  have  the  same 
geometry  can  potentially  lead  to  more  optimal  layouts.  However  in  the  case
of  highly  regular  structures  with  large  replication  factors,  what  goes  on  along
the  boundary  of  arrays  of  cells  has  a  negligible  impact  on  the  total  size
of  the  layout.  Most  of  the  cells  in  a  large  structure  are  far  away  from  the 
boundaries  of  the  array  (assumed  for  simplicity  sake  to  be  an  array  of  identical 
cells)  anyway  and  hence  geometrical  constraints  on  each  of  them  can  be  nearly 
identical  since  the  constraints  caused  by  the  boundary  of  the  array  can  be 
attenuated.  Hence  the  constraint  that  the  layout  of  all  the  instances  of  A  be 
identical  after  compaction  may  not  be  too  restrictive.  Furthermore  assuming 
that  compactors  are  not  perfect  and  do  from  time  to  time  produce  legal  but 
electrically  poor  layout,  quality  control  of  the  compactor  output  can  more 
easily  be  performed  on  a  library  of  a  few  cells  than  on  each  of  the  large
layouts  generated  by  an  RSG  type  generator. 

At  this  point  let  us  take  a  step  back  and  examine  the  real  motivation  be¬ 
hind  a  leaf  cell  compactor  and  the  motivation  behind  a  classical  compactor, 
since  they  differ  in  essence.  A  good  classical  compactor  should  be  able  to 
start  with  a  stick  diagram  or  a  very  poorly  designed  starting  layout.  From 
this  poor  starting  point  the  compactor  should  be  able  to  investigate  differ¬ 
ent  compaction  options  in  order  to  find  an  optimal  (or  satisfactory)  layout. 
Unfortunately  for  a  given  electrical  functionality,  the  space  of  legal  layouts  is 
not  convex.  This  means  that  if  we  use  a  model  where  we  continuously  deform 
the  starting  layout  in  search  of  a  more  optimal  one  (while  keeping  the  layout 
legal  at  all  times)  we  might  have  to  shrink  as  well  as  expand  the  layout  as 
we  move  along  a  path  leading  to  an  optimal  solution.  Therefore  a  greedy 
algorithm  which  looks  only  for  a  local  minimum  can  fail  to  find  very  profitable


optimizations  which  require  hill  climbing  (moving  temporarily  in  a  direction
leading  to  a  less  optimal  layout).  One  dimensional  compactors,  which  compact
in  one  dimension  at  a  time,  are  an  example  of  greedy  optimizations  which
do  not  lead  to  the  optimal  solution.  A  one  dimensional  compaction  algorithm
tries  to  greedily  optimize  one  dimension  at  a  time  and  misses  out  on  the
optimizations  that  require  a  more  careful  analysis  of  the  interaction  between
the  two  dimensions.  Besides  the  fact  that  the  space  of  legal  layouts  may  not
be  convex  it  may  also  not  be  connected.  In  order  to  reach  an  optimum  by  a 
continuous  deformation  from  the  initial  layout  one  might  have  to  deform  the 
layout  along  a  path  parts  of  which  do  not  correspond  to  legal  layouts. 


The  motivation  behind  a  leaf  cell  compactor  is  to  be  able  to  transform 
cells  from  one  technology  to  another  and  also  to  be  able  to  size  busses  and 
devices.  The  cells  already  existing  in  the  library  can  be  assumed  to  be  highly 
optimized  for  the  technology  in  which  they  are  designed  and  there  is  a  good 
chance  that  the  topology  of  the  initial  layout  can  be  used  as  a  good  starting 
point  for  the  target  technology  into  which  we  are  going  to  compact  the  cells. 
Under  these  assumptions  the  minimum  (of  the  objective  function)  has  a  better
chance  to  be  reached  by  a  greedy  type  algorithm  that  searches  for  a  local
minimum.  Hence  some  of  the  inherent  difficulties  in  leaf  cell  compaction  can
be  offset  by  the  previous  simplifying  assumptions  on  the  initial  starting  layout 
(namely  that  the  cells  in  the  library  can  be  assumed  to  be  designed  carefully 
and  the  easier  quality  control  of  the  output)  making  the  task  of  designing 
such  a  compactor  a  more  manageable  one. 

6.2  Defining  a  cost  function

The  purpose  of  this  section  is  to  show  the  importance  and  raise  some  of 
the  issues  related  to  defining  a  layout  cost  function  for  a  leaf  cell  compactor. 
The  cost  function  is  an  evaluation  of  the  goodness  of  the  layout  and  the 
compactor’s  goal  is  to  produce  the  layout  with  the  lowest  cost  subject  to  a 
set  of  constraints.  Defining  a  cost  function  for  a  leaf  cell  compaction  scheme 
is  not  as  straightforward  as  it  is  in  the  case  of  a  simple  compactor.  Also  the 
impact  of  the  chosen  cost  function  on  the  final  layout  (variations  in  the  final 
layouts  produced  using  different  cost  functions)  may  be  greater  than  would 
be  the  case  in  simple  compaction. 

Figure  6.1:  Defining  a  cost  function.

Figure  6.1  shows  a  structure  consisting  of  a  linear  array  of  cells.  The  m
rightmost  cells  are  of  type  A  and  have  pitch  λ_a;  the  n  leftmost  cells  are  of
type  B  and  have  pitch  λ_b.  It  can  be  shown  that  in  the  general  case  (if  there
are  constraints  between  A  and  B  other  than  those  shown  in  Figure  6.1)  there
are  tradeoffs  between  minimizing  λ_a  and  λ_b:  λ_a  can  be  minimized  to  a  greater
extent  at  the  cost  of  increasing  λ_b  and  vice  versa.  Let  us  consider  an  extremely
simple  cost  function  for  simple  compaction  and  try  to  find  a  corresponding
cost  function  in  the  case  of  leaf  cell  compaction.  Let  the  cost  function  be
X,  the  x  dimension  size  of  the  layout  (for  simplicity's  sake  assume  that  the  y
coordinates  are  fixed).  Finding  an  optimal  λ_a  and  λ_b  (given  the  geometric
constraints)  so  as  to  minimize  X  depends  on  the  replication  parameters  n  and
m.  However  in  a  leaf  cell  compactor  n  and  m  are  not  known  at  compaction
time.  Hence  the  user  has  to  explicitly  provide  a  cost  function  in  terms  of  λ_a
and  λ_b  (as  well  as  other  parameters)  based  on  empirical  estimates  of  what
n  and  m  are  expected  to  be.  In  the  case  where  n  and  m  are  large  numbers,
X  ≈  m λ_a  +  n λ_b;  therefore  minimizing  λ_a  and  λ_b  is  much  more  important  than
minimizing  the  sizes  of  the  cells  themselves.  For  a  given  λ_a  and  λ_b  (assume
for  simplicity's  sake  that  the  I_ab  interface  is  fixed)  reducing  the  size  of  A  and
B  has  only  a  marginal  impact  on  X  because  it  affects  only  the  extremities
of  the  array:  its  impact  is  independent  of  the  replication  factors  n  and
m.  Hence  the  cost  function  should  depend  essentially  on  λ_a  and  λ_b  and  to  a
much  lesser  extent  on  the  physical  sizes  of  the  cells  themselves.
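
To make the dependence on the replication factors concrete, the following sketch evaluates the large-array approximation of the width, taking m cells of type A at pitch λ_a and n cells of type B at pitch λ_b, for two candidate pitch pairs. The candidate values and the function names are hypothetical and chosen only for illustration; they do not come from any layout discussed here.

```python
def array_width(m, n, pitch_a, pitch_b):
    """Approximate x size of a row of m cells of type A and n cells of type B.

    End effects (the sizes of the outermost cells) are ignored, as in the
    large-array approximation X ~ m*pitch_a + n*pitch_b.
    """
    return m * pitch_a + n * pitch_b


def best_candidate(m, n, candidates):
    """Return the (pitch_a, pitch_b) pair giving the smallest width."""
    return min(candidates, key=lambda p: array_width(m, n, p[0], p[1]))


# Hypothetical tradeoff: tightening one pitch loosens the other.
CANDIDATES = [(10, 14), (12, 11)]

if __name__ == "__main__":
    for m, n in [(100, 5), (5, 100)]:
        print(m, n, best_candidate(m, n, CANDIDATES))
```

With many A cells the pair with the smaller λ_a wins, and with many B cells the other pair wins, which is why a cost function chosen for one set of replication factors need not suit another.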

The  remainder  of  this  section  describes  a  layout  example  where  the  pitches
λ_i  between  the  cells  do  in  fact  have  to  be  traded  off.  Figure  6.2(a)  shows  three
instances  of  the  same  cell  A.  The  cell  A  consists  of  two  horizontal  bars.  Since
the  three  instances  are  all  of  the  same  cell  type  the  pitch  between  them  is  the
x  distance  between  the  left  edges  of  their  bounding  boxes.  This  is  because  the
x  distance  between  their  respective  points  of  call  and  the  left  edges  of  their
bounding  boxes  is  the  same  and  hence  cancels  out  in  the  pitch  calculation.
One  can  reduce  the  λ_1  pitch  by  moving  the  top  bar  of  the  top  instance  toward
the  left.  This  causes  the  layout  to  deform  to  the  configuration  of  Figure  6.2(b).
Moving  the  top  bar  of  the  topmost  instance  to  the  left  causes  the  bottom
bar  of  the  middle  instance  to  move  to  the  right,  increasing  the  pitch  λ_2  in  so
doing.

Choosing  an  appropriate  cost  function  can  be  facilitated  by  the  knowledge 
of  the  replication  parameters  in  the  structure  to  be  built  from  the  leaf  cells. 
An  optimal  cost  function  for  a  given  set  of  replication  parameters  may  not 
be  optimal  for  another  set  of  parameters.  In  practice,  however,  tradeoffs 
between  the  pitches  may  not  be  as  extreme  as  in  Figure  6.2.  Experimental 
results  are  needed  to  determine  just  how  much  interaction  there  is  between 
the  pitches  of  leaf  cells  that  occur  in  practice.  Making  the  cost  function  linear
in  the  λ_i  and  the  box  edge  locations  can  substantially  simplify  the  problem
of  solving  the  constraint  system,  i.e.  finding  a  minimum  for  the  cost  function
subject  to  the  constraints.

6.3  Constraint  Representation 

The  purpose  of  this  section  is  to  propose  a  representation  of  the  constraint 
system  in  leaf  cell  compaction.  It  is  assumed  that  the  reader  is  somewhat 
familiar  with  graph  based  constraint  systems.  We  will  restrict  ourselves  to  one
dimensional  compaction  in  the  x  dimension.  Compacting  in  the  x  dimension
entails  determining  the  abscissas  of  all  the  vertical  edges  of  the  boxes  in  a
layout.  Horizontal  edges  play  no  role  in  the  constraint  representation  and  are
assumed  to  shrink  or  expand  in  response  to  the  displacement  of  the  vertical
edges.  In  the  case  of  leaf  cell  compaction  the  unknowns  of  the  problem  are
the  abscissas  of  the  vertical  edges  of  boxes  in  the  leaf  cells,  as  well  as  the
λ_i,  which  are  the  x  dimension  pitches  between  the  various  cells.  The  known


parameters  are  the  design  rules  of  the  process,  the  sizing  constraints  that  arise 
from  electrical  considerations  and  the  electrical  network  implicit  in  the  initial 
layout.  The  constraints  that  arise  from  the  interaction  of  the  parameters  can 
be  represented  by  a  constraint  graph  whose  vertices  correspond  to  vertical 
edges  of  boxes  in  the  layout.  The  edges  between  the  vertices  in  the  graph 
correspond  to  minimum  spacing  constraints  between  the  objects  represented 
by  the  vertices.  The  weights  on  the  edges  of  the  graph  are  the  actual  values 
of  the  minimum  permissible  distances  between  the  vertices. 

A  possible  strategy  for  leaf  cell  compaction  is  to  build  a  constraint  graph 
for  each  of  the  leaf  cells  and  then  include  the  constraints  arising  from  the 
interaction  of  the  cells  by  adding  new  edges  between  the  graphs.  The  resulting
graph  (formed  by  the  union  of  the  leaf  cell  constraint  graphs  and  the  new 
edges)  has  2  kinds  of  constraints:  intra  cell  constraints  (constraints  within 
a  cell)  and  inter  cell  constraints  (constraints  from  the  interaction  between 
cells).  Both  intra  cell  and  inter  cell  constraints  can  be  extracted  from  an 
RSG  sample  layout.  The  intra  cell  constraints  can  be  extracted  from  the  cell 
definitions  of  the  leaf  cells  in  the  sample  layout.  Inter  cell  constraints  can  be 
determined  from  the  various  cell  interfaces  present  in  the  sample  layout.  After 
the  compaction  is  completed,  it  is  possible  to  build  a  new  sample  layout  for 
the  new  technology  and  electrical  constraints,  from  the  new  cell  definitions  of 
the  leaf  cells  and  the  new  pitch  parameters  (both  of  which  were  the  unknowns 
of  the  initial  compaction  problem).  Recall  from  Section  3.1  that  the  sample 
layout  does  not  necessarily  have  to  contain  all  the  possible  interfaces  that 
might  occur  in  a  final  layout  (because  the  RSG  connectivity  graph  need  only 
be  a  spanning  tree).  However  if  a  sample  layout  is  to  be  used  for  leaf  cell 
compaction,  then  in  order  for  the  compactor  to  generate  all  the  required 


inter  cell  constraints  it  is  imperative  that  all  possible  interfaces  that  might 
arise  in  the  final  layout  be  present  in  the  sample  layout.  The  next  paragraph 
describes  how  these  constraints  can  be  generated  in  the  very  simple  case 
where  the  sample  layout  contains  1  cell  and  1  interface. 

Figure  6.3:  Constraint  representation.

Figure  6.3  shows  two  instances  of  A  interfaced  together.  A  is  a  cell  containing
four  vertical  (box)  edges.  The  left  (respectively  right)  instance  of  A  as
well  as  the  corresponding  1, 2, 3, 4  (respectively  1', 2', 3', 4')  constraint  graph
and  the  edges  in  the  graph  are  shown  in  solid  (respectively  dotted)  line.  Inter
cell  constraints  between  the  two  instances  arising  from  the  existence  of
the  I_aa  interface  are  shown  in  broken  line.  If  compaction  were  performed  on
the  1, 2, 3, 4, 1', 2', 3', 4'  graph,  the  compacted  layouts  of  the  two  instances  of
A  might  not  be  identical.  The  unknowns  of  the  problem  are  the  abscissas  of
the  four  vertical  edges  in  the  cell  (and  not  the  instances  of)  A  and  the  pitch
λ_a  after  compaction.  We  must  express  the  constraint  system  in  terms  of  a
graph  where  the  vertices  are  the  vertical  box  edges  of  A  and  the  weights  are
functions  of  λ_a.  This  will  ensure  that  both  instances  of  A  in  the  compacted
layout  have  the  same  geometries.  Since  the  pitch  between  the  two  instances
is  λ_a  the  distance  between  the  1  and  the  1'  node  is  necessarily  λ_a.  Hence
since  node  4  must  be  x_4  to  the  left  of  node  1'  it  must  be  x_4 - λ_a  to  the  left
of  1.  Therefore  we  can  replace  the  dashed  edge  weighted  by  x_4  by  an  edge
from  node  4  to  node  1  weighted  by  x_4 - λ_a.  Similarly  we  can  replace  the
edge  between  node  4  and  node  3'  weighted  by  x_5  by  an  edge  between  node
4  and  node  3  weighted  by  x_5 - λ_a.  Once  this  edge  replacement  is  complete
we  can  discard  the  1', 2', 3', 4'  graph  and  all  edges  terminating  on  vertices  of
that  graph.  We  are  then  left  with  the  1, 2, 3, 4  graph  where  the  edges  drawn
with  straight  lines  are  intra  cell  constraints  and  edges  drawn  with  arcs  are  the
inter  cell  constraints.  The  new  constraint  system  ensures  that  both  instances
of  A  will  have  the  same  geometries  and  at  the  same  time  reduces  the  number
of  unknowns  from  8  (the  abscissas  of  1, 2, 3, 4, 1', 2', 3', 4')  to  5  (the  abscissas
of  1, 2, 3, 4  and  λ_a).  In  the  case  of  larger  cells  and  multiple  interfaces,  the
reduction  in  the  number  of  unknowns  can  be  much  more  substantial  since
only  one  new  unknown  (a  λ_i  pitch  parameter)  is  added  for  each  new  interface.
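
The edge replacement described above can be sketched in a few lines. This is a minimal illustration under assumed conventions, not the representation used in any existing compactor: each constraint is a tuple (src, dst, const, k) meaning X[dst] - X[src] >= const + k * pitch, nodes of the left instance are integers, nodes of the right instance are strings such as "1'", and the function name and numeric weights used in the docstring example are hypothetical.

```python
def fold_instance(constraints):
    """Fold the right-hand instance of a cell onto the left-hand one.

    Each constraint is a tuple (src, dst, const, k) meaning
        X[dst] - X[src] >= const + k * pitch.
    Nodes of the left instance are integers (1, 2, ...); the matching node
    of the right instance is written as the string "1'", "2'", ...
    Because X[n'] = X[n] + pitch, every primed node can be eliminated.
    For example the inter cell edge (4, "1'", 2, 0) becomes (4, 1, 2, -1).
    """
    folded = set()
    for src, dst, const, k in constraints:
        if isinstance(src, str):   # primed source: X[src] = X[base] + pitch
            src, k = int(src.rstrip("'")), k + 1
        if isinstance(dst, str):   # primed target: X[dst] = X[base] + pitch
            dst, k = int(dst.rstrip("'")), k - 1
        # The set drops the intra cell edges duplicated by the second instance.
        folded.add((src, dst, const, k))
    return sorted(folded)
```

An inter cell edge from node 4 to node 1' with weight x_4 comes out as an edge from node 4 to node 1 with weight x_4 - pitch, while an edge between two primed nodes collapses onto the corresponding intra cell edge.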

This  graph  constraint  system  cannot  be  solved  by  shortest  path  algorithms
such  as  Bellman  Ford  [17]  because  the  weights  on  the  edges  are  not
all  constants.  Some  of  the  weights  depend  on  the  λ_i,  which  must  also  be
determined.  Algorithms  such  as  the  Bellman  Ford  algorithm  are  used  to  solve
a  system  of  linear  inequalities  where  there  are  only  (at  most)  two  unknowns
per  inequality.  Such  systems  can  be  represented  by  a  constraint  graph  with
constant  weight  edges.  However  (if  the  abscissas  of  the  vertices  1, 2, 3, 4  are
X_1, X_2, X_3, X_4)  in  the  resulting  graph  of  Figure  6.3  the  edge  between  node  4
and  node  1  represents  the  inequality  X_1 - X_4 ≥ x_4 - λ_a  where  X_1,  X_4  and  λ_a
are  all  unknowns.  A  simple  minded  way  to  solve  the  system  would  be  to  convert
the  graph  to  a  system  of  linear  inequalities  and  solve  that  system
using  a  linear  programming  algorithm  like  Simplex  [10].  Since  we  know  that
there  are  tradeoffs  between  the  λ_i  we  will  have  to  define  a  cost  function  that
is  to  be  minimized  subject  to  the  above  set  of  constraints.
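
As a sketch of what such a linear program might look like for the single-cell example of Figure 6.3, the folded system can be written as below. The intra cell spacings d_{ij}, the edge set E_intra, and the small weighting factor ε in the objective are notation assumed here for illustration; no particular cost function is prescribed by the text.

```latex
\begin{aligned}
\text{minimize}\quad & \lambda_a + \varepsilon\,(X_4 - X_1) \\
\text{subject to}\quad
  & X_j - X_i \ge d_{ij}, && (i,j) \in E_{\text{intra}}, \\
  & X_1 - X_4 \ge x_4 - \lambda_a, \\
  & X_3 - X_4 \ge x_5 - \lambda_a .
\end{aligned}
```

Both the objective and the constraints are linear in the unknowns X_1, ..., X_4 and λ_a. The two inter cell rows are the ones that involve three unknowns each, which is what takes the problem outside the reach of a plain path algorithm.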


6.4  Experiments  in  compaction 

Over  one  hundred  and  thirty  kilobytes  of  code  have  been  written  in  order 
to  build  an  experimental  compactor  with  the  intent  of  modifying  it  to  ulti¬ 
mately  do  leaf  cell  compaction.  One  third  of  the  compactor  code  deals  with 
maintaining  and  manipulating  the  data  structures  (such  as  scan  lines,  sorted
lists,  etc.)  required  by  the  constraint  generation  process.  This  is  where  most
of  the  CPU  time  is  spent.  One  fourth  of  the  code  embeds  the  decision  mak¬ 
ing  process  of  determining  what  type  of  constraint  is  appropriate  between 
a  pair  of  box  edges.  This  part  of  the  code  proved  to  be  the  most  convo¬ 
luted,  the  hardest  to  write  and  debug  and  also  the  most  error  prone.  The 
actual  constraint  solving  routine  (a  modified  Bellman  Ford  Algorithm:  see 
Subsection  6.4.2)  is  only  slightly  over  a  page  in  length.  The  rest  of  the  code 
is  overhead  and  consists  of  layout  manipulating  routines,  design  rule  tables 
etc.  The  speed  of  the  compactor  compares  favorably  with  other  compactors
and  the  output  quality  can,  depending  on  the  input  layout,  be  reasonably 
good.  However  for  a  large  complex  layout  the  compactor  will  often  produce 
a  legal  layout  where  small  regions  of  the  layout  are  electrically  poor,  making 
hand  checking  (and  minor  modifications)  of  the  result  a  necessity. 




While  the  general  methods  and  mathematical  foundations  of  the  com¬ 
paction  problem  are  well  understood  they  seem  inadequate  to  deal  with  the 
myriad  of  special  cases  encountered  in  practice.  Whether  commercial  compactors
function  properly  in  a  realistic  VLSI  setting  is  still  an  open  question
for  me,  as  I  did  not  have  another  compactor  readily  available  with  which  to
compare  results.  However  I  believe  that  my  compactor  would  compare
favorably  on  many  of  the  examples  found  in  compaction  papers.  Rather  than
laboriously  go  through  the  quagmire  of  designing  and  implementing  a  rea¬ 
sonable  compactor,  I  will  skim  through  some  of  the  salient  difficulties  and 
in  some  cases  propose  solutions  to  the  problems  I  encountered.  Many  of  the 
classical  difficulties  of  compaction  are  explained  in  [31]. 

The  rest  of  this  section  is  for  the  benefit  of  whoever  continues  the  compactor
project.  It  describes  three  major  difficulties  (encountered  during  the
compactor  project)  which  can  be  corrected  by  a  more  appropriate  choice  of
strategy.  Its  intent  is  not  to  give  an  overview  of  the  compaction  problem.  The
compactor  used  a  one  dimensional  graph-based  constraint  method  where  the
vertices  in  the  graph  represent  layout  box  edges¹.  Other  one  dimensional
techniques  include  shear  line  compaction  [9].

6.4.1  Constraint  generation 

One  of  the  purposes  of  the  compactor  is  to  perform  device  and  bus  sizing. 
Device  and  bus  sizing  requires  the  ability  to  tag  (identify)  the  particular 
devices  (or  buses)  to  be  sized  in  the  layout.  This  can  be  accomplished  by 
making  the  bus  (or  the  gate  and  channel  of  the  device)  to  be  sized,  a  cell. 

¹ The  edges  are  vertical  since  it  is  assumed  throughout  this  section  that  compaction  is  being
performed  in  the  x  dimension.


The  compactor  can  then  size  all  instances  of  that  cell  according  to  some
user  defined  specification.  In  some  processes  transistor  gates  must  be  wider
than  the  minimum  poly  width.  This  can  be  achieved  by  making  the  gates  of
transistors  instances  of  a  particular  cell.  The  compactor  must  then  make  all
instances  of  that  cell  a  certain  minimum  size.  Finally  there  may  be  critical
parts  of  the  layout  (such  as  sense  amplifiers)  which  must  be  left  unchanged
by  the  compactor.  This  also  can  be  achieved  by  building  those  portions  of  the
layout  (to  be  kept  frozen)  out  of  cells  which  the  compactor  will  know  how
to  handle.

Many  compactors  first  perform  a  preprocessing  phase  on  the  layout.  During
this  preprocessing  phase  boxes  of  the  same  layer  are  merged  together.  For
example  EXCL  uses  a  merging  technique  (although  not  for  compaction)  which
gets  rid  of  redundant  vertical  edges  of  boxes.  After  the  merging  process  is
complete  each  layer  of  the  layout  consists  of  nonoverlapping  boxes  such  that
each  box  has  the  largest  possible  x  dimension  size  (as  a  result  of  this  there
are  no  hidden²  or  partially  hidden  vertical  edges).

Figure  6.4:  Constraint  for  hidden  edges

Figure  6.5:  Fragmented  Layout

Merging  boxes  considerably  simplifies  the  constraint  generation  problem.
Figure  6.4  shows  two  boxes  of  the  same  layer  (in  solid  line).  The  existence  of  a
minimum  spacing  constraint  between  the  right  edge  of  the  left  box  and  the
left  edge  of  the  right  box  depends  on  the  presence  of  the  middle  box  (shown  in
broken  line)  whose  presence  masks  the  two  previous  edges.  Always  generating
the  constraint  between  those  two  edges  (regardless  of  the  presence  of  the
middle  box)  can  substantially  overconstrain  the  system.  Consider  a  piece  of
diffusion  fragmented  into  n  abutting  boxes  as  in  Figure  6.5.  Indiscriminately
generating  constraints  between  left  edges  and  right  edges  would  force  the  x
size  of  the  final  layout  to  be  at  least  nλ  where  λ  is  the  minimum  spacing  for
diffusion.  Merging  the  boxes  into  one  box  would  get  rid  of  the  fragmentation
and  allow  the  layout  to  shrink  to  the  minimum  width  for  diffusion.
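
A minimal sketch of such a merging step is given below, under simplifying assumptions: boxes are (x_left, x_right, y_bottom, y_top) tuples on a single layer, and only boxes with an identical y extent are merged, which is enough for a strip cut into abutting fragments as in Figure 6.5. The function name is hypothetical, and this is not the technique used by EXCL.

```python
def merge_boxes_x(boxes):
    """Merge same-layer boxes that abut or overlap in x and share a y extent.

    Each box is (x_left, x_right, y_bottom, y_top). Boxes are grouped by
    y extent, visited left to right, and fused with the previous box
    whenever they touch or overlap it.
    """
    merged = []
    for box in sorted(boxes, key=lambda b: (b[2], b[3], b[0])):
        x_left, x_right, y_bottom, y_top = box
        if merged:
            px_left, px_right, py_bottom, py_top = merged[-1]
            same_row = (py_bottom, py_top) == (y_bottom, y_top)
            if same_row and x_left <= px_right:  # abutting or overlapping
                merged[-1] = (px_left, max(px_right, x_right), y_bottom, y_top)
                continue
        merged.append(box)
    return merged
```

After merging, a strip made of n fragments is a single box with one left edge and one right edge, so no spacing constraints can be generated between its interior fragment edges.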

Unfortunately,  due  to  the  device  and  bus  sizing  mechanism  in  the  com¬ 
pactor,  it  is  not  possible  to  perform  merging  on  the  boxes.  Merging  boxes 
causes  loss  of  information  relating  to  which  cells  the  boxes  came  from.  A 
long  bus  might  need  to  be  wider  in  certain  regions.  These  regions  can  be
identified  by  the  compactor  as  being  part  of  certain  cells.  Merging  the  boxes
in  the  bus  of  Figure  6.5  would  cause  the  loss  of  that  information  since  after
the  merging  process  there  is  only  one  box  for  the  whole  bus.  This  constraint 
(i.e.  merging  being  unacceptable)  combined  with  the  wrong  constraint  gener¬ 
ation  technique  made  constraint  generation  an  extremely  hard  problem.  The 
main  problem  is  to  generate  enough  constraints  so  that  the  result  is  a  legal 
layout  without  overconstraining  the  system,  which  degrades  the  quality  of 


the  result. 

The  minimal  constraint  set  is  not  unique  (a  minimal  constraint  set  is
one  such  that  removing  any  constraint  from  it  may  cause  the  resulting  layout  to
become  illegal)  and  therefore  it  is  not  possible  to  reach  the  optimal  constraint
set  simply  by  removing  overconstraining  constraints.  Generating  a  good
constraint  set  is  a  particularly  hard  problem.  Substantial  gain  in  output  quality
can  be  made  by  simply  making  the  constraint  generator  smarter  without 
having  to  go  to  a  more  complex  compaction  strategy  as  in  two  dimensional 
compaction  [15]. 

Most  graph  based  compactors  use  a  scan  line  technique  for  the  generation 
of  constraints.  Other  reasonable  ways  of  generating  constraints  include  walking
through  a  layout  database  as  in  MAGIC  where  each  box  (tile)  has  pointers
to  its  neighbors  (corner  stitching).  There  are  essentially  two  possible  ways
to  perform  scanning.  The  way  it  was  performed  in  the  compactor  was  using
a  scan  line  which  represents  a  slice  through  the  layout³.  Constraints  in  the
x  dimension  are  generated  with  a  horizontal  scan  line  that  moves  vertically.
At  any  given  time  the  scan  line  holds  the  part  of  the  layout  that  intersects
its  current  y  position⁴.  Only  objects  that  were  in  the  scan  line  at  the  same
time  can  have  a  constraint  between  them.  If  the  current  scan  line  location 
intersects  the  piece  of  diffusion  in  Figure  6.5  then  all  the  boxes  in  the  Figure 
are  simultaneously  present  in  the  scan  line.  The  constraint  generator  must 
then  examine  each  pair  of  vertical  edges  and  determine  what  constraint  to 
put  between  them.  In  order  to  determine  the  appropriate  constraint  between 
Figure  6.6:  Constraint  between  partially  hidden  edge


a  pair  of  edges,  the  constraint  generator  has  to  shuffle  through  the  objects  in 
the  scan  line  to  examine  the  relevant  neighboring  objects.  This  turns  out  to 
be  one  of  the  most  difficult  and  critical  parts  of  the  compactor.  A  smart  compactor
must  at  least  notice  that  some  of  the  edges  might  be  hidden  and  that
it  may  not  be  appropriate  to  put  a  constraint  between  them.  Deciding  on  an
appropriate  constraint  is  not  a  straightforward  task.  In  Figure  6.6  the  right
edge  of  the  leftmost  box  and  the  left  edge  of  the  rightmost  box  are  hidden
when  the  scan  line  is  at  location  y_1.  However  when  the  scan  line  reaches  y_2
the  edges  are  no  longer  hidden  and  therefore  the  constraint  generator  must
place  a  constraint  between  the  two  edges.
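
The hidden edge test that a horizontal scan line forces on the constraint generator can be sketched as follows. The geometry conventions and the function name are assumptions made for illustration: boxes are (x_left, x_right, y_bottom, y_top) tuples on one layer, and an edge counts as hidden at a given scan height if another box of that layer crosses the scan line and covers or abuts the edge.

```python
def edge_hidden(x_edge, y_scan, owner, boxes):
    """Return True if the vertical edge at x_edge is covered at height y_scan.

    boxes is a list of (x_left, x_right, y_bottom, y_top) tuples on one layer
    and owner is the index of the box the edge belongs to. The edge is hidden
    when some other box both crosses the scan line and spans x_edge.
    """
    for i, (x_left, x_right, y_bottom, y_top) in enumerate(boxes):
        if i == owner:
            continue
        crosses_scan = y_bottom <= y_scan < y_top
        covers_edge = x_left <= x_edge <= x_right
        if crosses_scan and covers_edge:
            return True
    return False
```

For a hypothetical arrangement in the spirit of Figure 6.6, with boxes [(0, 4, 0, 10), (4, 8, 0, 5), (8, 12, 0, 10)], the right edge of the first box is hidden at scan height 2 but exposed at scan height 7, so the same pair of edges needs a different decision at different scan positions.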

By  selecting  a  more  appropriate  scanning  technique  it  is  possible  to  eliminate
part  of  the  hidden  edge  problems.  The  scan  line  can  be  a  vertical  line
that  sweeps  from  -∞  to  +∞  (we  are  still  generating  constraints  for  the  x
dimension).  The  scan  line  contains  information  of  what  a  viewer  on  the  scan 
line  looking  toward  the  left  would  see.  In  Figure  6.7  the  viewer  on  the  scan 
line  would  see  the  x_2,  x_3  segment  of  the  left  box  and  will  see  the  x_1,  x_2  segment
as  belonging  to  the  insides  of  the  right  box.  Constraints  are  placed  between 
what  the  viewer  can  see  in  the  scan  line  and  the  objects  that  currently  inter¬ 
sect  the  scan  line.  More  details  on  this  scan  line  technique  and  relevant  data 




Figure  6.7:  Correct  scan  line  method 

structures  can  be  found  in  [11]  and  [24]⁵.  The  advantage  of  this  method  is
that  hidden  edges  are  automatically  taken  care  of  because  they  do  not  show 
up  in  the  scan  line.  Hence  merging  of  boxes  is  implicitly  taken  care  of. 

6.4.2  Solving  the  Constraint  System 

The Bellman-Ford algorithm [17] was used to solve the graph-based constraint
system. Bellman-Ford assigns to each vertex the lowest possible abscissa
subject to the constraints. The algorithm proved to be extremely fast,
especially if the edges are traversed in order of their abscissa, i.e. a
preliminary sort of the edges according to their abscissa in the initial
layout is performed. This is because the initial ordering of the edges is a
good estimate of the final ordering. Going through the edges in a suitable
order considerably reduces the number of Bellman-Ford relaxation steps. In
the case where the initial ordering is preserved in the final layout, exactly
one relaxation step is required instead of the |E| (where |E| is the number
of vertices in the constraint graph) required in the worst case.
Unfortunately, while Bellman-Ford does a good job of minimizing the total
size (bounding box) of the layout, it can generate electrically poor layouts.
This is because although the algorithm minimizes the longest path it can
actually increase the length of other paths (up to the length of the longest
path).

Figure 6.8: Worsening of a layout Jog

The Bellman-Ford algorithm consists of pushing all the objects in a layout
as far to the left as they can go subject to the constraints. When applied
to the layout of Figure 6.8(a), the resulting layout of Figure 6.8(b)
develops a jog. A more appropriate algorithm would be one that tries to
bring all objects close together as if they were all connected by rubber
bands, instead of trying to move them all to one side as if they were being
attracted by a large magnet on the left.
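
The leftward push and the effect of the traversal order can be illustrated
with a small Python sketch. It is a generic longest-path relaxation written
for this example, not the code of the experimental compactor; the function
name compact_left, the presort flag, and the (i, j, d) constraint format
(meaning x[j] - x[i] >= d) are assumptions made for illustration.

```python
def compact_left(initial_x, constraints, presort=True):
    """Push every edge as far left as the constraints allow.

    initial_x:   abscissa of each edge in the initial layout.
    constraints: tuples (i, j, d) meaning x[j] - x[i] >= d.
    Returns (positions, passes), where passes counts the relaxation
    passes that actually moved something.
    """
    n = len(initial_x)
    ordered = list(constraints)
    if presort:
        # Visit constraints in order of the initial abscissa of their
        # left-hand edge; this is a good guess at the final ordering.
        ordered.sort(key=lambda c: initial_x[c[0]])
    x = [0] * n  # every edge starts against the left boundary
    passes = 0
    for _ in range(n):
        changed = False
        for i, j, d in ordered:
            if x[i] + d > x[j]:
                x[j] = x[i] + d  # relax: move edge j right just enough
                changed = True
        if not changed:
            return x, passes
        passes += 1
    raise ValueError("inconsistent constraints (positive cycle)")
```

For a chain of four edges with initial abscissas [0, 4, 9, 15] and
constraints (0, 1, 3), (1, 2, 2), (2, 3, 3), the presorted traversal settles
in one pass, while visiting the same constraints in reverse order without
sorting takes three passes to reach the same positions.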
6.4.3  Dealing  with  Layer  Interaction


Some design rules, such as those for contacts or gates, are hard if not
impossible to express in terms of minimum spacing constraints between the
mask layers of a layout. These kinds of constraints often occur due to the
interaction of several layers at a time. For example, the width of poly may
be 3λ except over diffusion (the gate of a transistor), where it might have
to be 5λ. Not knowing beforehand where in the compacted layout poly will end
up over diffusion, it is hard to determine which regions of poly should have
a 5λ width constraint on them. This is because constraints are generated
based on the initial layout, whose topology will change during compaction.

One way of solving this class of problems is to create new layers that do
not correspond to actual mask layers in the lithographic process. This
method is already used in editors such as Magic [26]. For example, Magic has
a special layer called contact which has design rules similar to those of any
other layer. This special layer is composed of metal, poly and the actual
contact cut (or cuts) between them. At mask creation time the contact layer
is converted into actual lithographic mask layers, which may contain one or
several contact cuts depending on the size of the contact layer. The
appropriate metal and poly overlaps as well as the size and spacing of the
contact cuts can be looked up in a table. Figure 6.9 shows an example of what
this translation process might look like when applied to a large contact
layer. The same type of strategy can be used for transistors, buried
contacts, etc. The benefit of this strategy is that the new layers that
result from the interaction of several primitive layers can often be
characterized by simple design rule constraints, whereas the interaction of
the different layers often cannot.
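
The table-driven translation of a contact layer into individual cuts can be
sketched as follows. The rule values (cut size 2, cut spacing 2, overlap 1,
all in λ) are made-up placeholders and are not Magic's rules or those of any
real process; the function name expand_contact is likewise hypothetical.

```python
def expand_contact(width, height, cut=2, spacing=2, overlap=1):
    """Return lower-left corners of the contact cuts inside a contact box.

    Coordinates are relative to the lower-left corner of the contact
    layer rectangle. cut, spacing and overlap stand in for values that
    would be looked up in a design-rule table.
    """
    def along(extent):
        usable = extent - 2 * overlap            # room left inside the overlap
        count = (usable + spacing) // (cut + spacing)
        if count < 1:
            raise ValueError("contact layer too small for a single cut")
        span = count * cut + (count - 1) * spacing
        start = (extent - span) / 2              # centre the row of cuts
        return [start + k * (cut + spacing) for k in range(count)]

    return [(x, y) for y in along(height) for x in along(width)]
```

Under these placeholder rules a 4 by 4 contact yields a single centred cut,
a 10 by 4 contact yields two cuts side by side, and a 12 by 12 contact
yields a 3 by 3 array of cuts.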
Figure  6.9:  Contact  layer  Expanded

6.5  Summary  and  New  Directions

In this chapter some of the benefits and difficulties of leaf cell compaction
have been explored. A constraint representation for leaf cell compaction has
also been proposed. Difficulties encountered during the design and
implementation of an experimental compactor (a flat layout compactor) have
been described and improvements have been suggested. The rest of this
section describes a plausible sequence of steps leading to the
implementation and evaluation of an efficient leaf cell compactor.

Section 6.4.3 describes the problems of dealing with layer interaction.
These problems occur because design rules arising from layer interaction
cannot be described in terms of minimum spacing constraints. A successful
compactor must be built on top of underlying mechanisms for transforming a
set of physical mask layers into special layers as prescribed by Section
6.4.3, and transforming these special layers back into physical layers. A
flexible constraint generator (for flat layout compaction) implementing the
right kind of scanning technique and a carefully constructed set of
constraint generation rules must be built. The ultimate goal is to modify
the constraint generator to do leaf cell constraint generation. Provisions
for interfacing the constraint generator to a device sizing tool such as [22]
must be considered. Care must be taken not to underestimate the difficulty
inherent in constraint generation, and a carefully charted course must be
laid out before any actual code is written. Testing the constraint generator
on anything larger than simple test cases cannot be accomplished without
building a throw-away test constraint solver (for flat compaction). The
constraint solver's purpose will be to facilitate testing of the constraint
generator by outputting actual compacted layouts instead of constraint
graphs. Once testing is completed, the constraint generator must be modified
to do leaf cell compaction, and an appropriate constraint solving algorithm
for leaf cell compaction must be selected or developed. The effects of
different cost functions on the new leaf cell compactor must be evaluated and
catalogued. Finally, how the compactor and the RSG can together constitute
an efficient layout module in a larger silicon compilation system must be
investigated.
Chapter  7


Conclusion 


The push to design larger and more complex VLSI chips has spurred the
creation of more sophisticated design tools. By restricting the target
architecture to designs that are regular and can be algorithmically
described, efficient and flexible layout generators that function well in a
realistic VLSI setting can be built. Regularity, however, does not exclude
complexity in the personalization of these structures. This thesis has
demonstrated the importance of the appropriate abstraction mechanisms
(macrocells, interfaces, and interface inheritance) in generating layouts for
realistic regular structures. The RSG is an operational tool that supports
true macro abstraction and inheritance. Due to the flexible target
architecture, greater generality than specialized module compilers can be
achieved without the loss of efficiency incurred in silicon compilers with a
fixed target architecture. The RSG presents a convenient interface to the
user by separating the graphical and procedural description of a circuit
along a natural boundary, making it an extremely easy tool to utilize,
extend, and upgrade. Information is efficiently partitioned into a design
file which describes the global layout connectivity and a sample file which
specifies the local placement constraints and the specifics of the primitive
cells. Tangible proof of the efficiency and applicability of the RSG method
to intricate regular structures that arise in meaningful applications was
provided by the design of a (class of) pipelined multiplier. The RSG's power
can be further enhanced by a special kind of compactor which will make the
RSG technology transportable and allow it to perform device and bus sizing.
The simple mechanisms used in the RSG can be easily embedded in a complete
VLSI design system. Such a design system would include placement and routing
and also compilation from a functional specification. The RSG could then be
an efficient link in the design chain from functional specification to
silicon.


Appendix  A 
Appendix  B
Multiplier  Design  File 


(macro mcell (xsize ysize xloc yloc)
  (locals c foo)
  (mk_instance c corecell)
  (cond ((= xsize xloc)
         (cond ((= ysize yloc) (connect c (mk_instance foo type1) t1inum))
               (true (connect c (mk_instance foo type2) t2inum))))
        (true (cond ((= ysize yloc)
                     (connect c (mk_instance foo type2) t2inum))
                    (true (connect c (mk_instance foo type1) t1inum)))))
  (cond ((= (mod xloc 2) 0)
         (connect c (mk_instance foo clock1) clk1inum))
        (true (connect c (mk_instance foo clock2) clk2inum)))
  (cond ((= yloc ysize) (connect c (mk_instance foo top2) top2inum))
        (true (connect c (mk_instance foo top1) top1inum))))

(macro mline (xsize ysize currentline)
  (locals l. ref)
  (assign l.1 (mcell xsize ysize 1 currentline))
  (setq ref (subcell l.1 c))
  (do (i 2 (+ 1 i) (> i xsize))
      (assign l.i (mcell xsize ysize i currentline))
      (connect (subcell l.(- i 1) c) (subcell l.i c) hinum)))

(macro m2darray (xsize ysize)
  (locals cl. topright bottomright)
  (assign cl.1 (mline xsize ysize 1))
  (setq topright (subcell cl.1 ref))
  (do (i 2 (+ 1 i) (> i ysize))
      (assign cl.i (mline xsize ysize i))
      (connect (subcell cl.(- i 1) ref) (subcell cl.i ref) vinum))
  (setq bottomright (subcell cl.ysize ref))
  (mk_cell mularrayname bottomright))

(macro mtopregs (size)
  (locals l. ref)
  (assign l.1 (array topreg 1 topregvinum))
  (setq ref (subcell l.1 c.1))
  (do (i 2 (+ 1 i) (> i size))
      (assign l.i (array topreg i topregvinum))
      (connect (subcell l.(- i 1) c.1) (subcell l.i c.1) topreghinum))
  (mk_cell topregisters ref))

(macro mbottomregs (size)
  (locals l. ref)
  (assign l.1 (array bottomreg size bottomregvinum))
  (setq ref (subcell l.1 c.size))
  (do (i 2 (+ 1 i) (> i size))
      (assign l.i (array bottomreg (- (+ 1 size) i) bottomregvinum))
      (connect (subcell l.(- i 1) c.(- (+ size 1) (- i 1)))
               (subcell l.i c.(- (+ 1 size) i)) bottomreghinum))
  (mk_cell bottomregisters ref))


(macro mrightregs (size)
  (locals l. ref length regnum)
  (setq regnum (+ 1 (* 3 size)))
  (setq length (// regnum 2))
  (cond ((= (mod regnum 2) 1) (setq length (+ 1 length))))
  (assign l.1 (array rightreg length rightreghinum))
  (assdirection l.1 1 length regnum)
  (setq ref (subcell l.1 c.1))
  (do (i 2 (+ 1 i) (> i size))
      (assign l.i (array rightreg length rightreghinum))
      (assdirection l.i i length regnum)
      (connect (subcell l.(- i 1) c.1)
               (subcell l.i c.1) rightregvinum))
  (mk_cell rightregisters ref))

(defun assdirection (rarray index length regnum)
  (locals ins outs bi foo doublereg)
  (setq ins (* index 2))
  (setq outs (- regnum ins))
  (setq bi (fmin ins outs))
  (cond ((> ins outs) (prog (setq doublereg inward)
                            (setq singlereg sinward)))
        (true (prog (setq doublereg outward)
                    (setq singlereg soutward))))
  (do (i 1 (+ 1 i) (> i bi))
      (connect (mk_instance foo bidirectional)
               (subcell rarray c.i) rtoregsinum))
  (connect (mk_instance foo singlereg)
           (subcell rarray c.(+ bi 1)) rtoregsinum)
  (do (i (+ bi 2) (+ i 1) (> i length))
      (connect (mk_instance foo doublereg) (subcell rarray c.i) rtoregsinum)))


(macro mall (xsize ysize)
  (locals arrayfoo tregs bregs rregs tri arrayi bri rri)
  (setq rregs (mrightregs ysize))
  (setq bregs (mbottomregs xsize))
  (setq arrayfoo (m2darray xsize ysize))
  (setq tregs (mtopregs xsize))
  (declare_interface topregistername arrayname 1 (subcell tregs ref)
                     (subcell arrayfoo topright) cell_to_topreginum)
  (connect (mk_instance tri topregistername)
           (mk_instance arrayi arrayname) 1)
  (declare_interface arrayname bottomregistername 1
                     (subcell arrayfoo bottomright)
                     (subcell bregs ref) cell_to_bottomreginum)
  (connect (mk_instance bri bottomregistername) arrayi 1)
  (declare_interface arrayname rightregistername 1
                     (subcell arrayfoo topright)
                     (subcell rregs ref) cell_to_rightreginum)
  (connect (mk_instance rri rightregistername) arrayi 1)
  (mk_cell "all" arrayi))

(defun fmin (x y)
  (locals)
  (cond ((> x y) y)
        (true x)))


(mall xsize ysize)


Appendix  C 

Multiplier  Parameter  File 


.example_file:/u/bamji/demo/mult.def
.concept_file:/u/bamji/demo/mult.con
.output_file:/u/bamji/demo/multout.def

vinum=2
hinum=1
t1inum=1
t2inum=1
mularrayname="array"
arrayname=array
corecell=cell
type1=t1
type2=t2
clk2inum=1
clk1inum=1
clock1=clk1
clock2=clk2
top1=top1cel
top2=top2cel
top1inum=1
top2inum=1

topregvinum = 2
topreghinum = 1
topreg = tr
topregisters = "topregs"
topregistername = topregs

bottomregvinum = 2
bottomreghinum = 1
bottomreg = br
bottomregisters = "bottomregs"
bottomregistername = bottomregs

rightregvinum = 2
rightreghinum = 1
rightreg = rr
rightregisters = "rightregs"
rightregistername = rightregs
Appendix  E
Adder  Cell  Layout 